Netdev List
* Re: [PATCH net-next] Add ethtool -g support to virtio_net
From: Rusty Russell @ 2011-10-20  7:55 UTC
  To: Rick Jones, netdev, mst, virtualization
In-Reply-To: <20111019181059.C644A29003F6@tardy>

On Wed, 19 Oct 2011 11:10:59 -0700 (PDT), raj@tardy.cup.hp.com (Rick Jones) wrote:
> From: Rick Jones <rick.jones2@hp.com>
> 
> Add support for reporting ring sizes via ethtool -g to the virtio_net
> driver.
> 
> Signed-off-by: Rick Jones <rick.jones2@hp.com>

Acked-by: Rusty Russell <rusty@rustcorp.com.au>

MST, want me to take this, or do you?

Cheers,
Rusty.


* Re: Flow classifier proto-dst and TOS (and proto-src)
From: Dan Siemon @ 2011-10-24  1:03 UTC
  To: Eric Dumazet; +Cc: netdev
In-Reply-To: <1318831799.2500.25.camel@edumazet-laptop>

On Mon, 2011-10-17 at 08:09 +0200, Eric Dumazet wrote:
> On Saturday, 15 October 2011 at 12:51 -0400, Dan Siemon wrote:
> > cls_flow.c: flow_get_proto_dst()
> > 
> > The proto-dst key returns the destination port for UDP, TCP and a few
> > other protocols [see proto_ports_offset()]. For ICMP and IPIP it falls
> > back to:
> > 
> > return addr_fold(skb_dst(skb)) ^ (__force u16)skb->protocol;
> > 
> > Since Linux maintains a dst_entry for each TOS value this causes the
> > returned value to be affected by the TOS which is unexpected and
> > probably broken.
> 
> Hi Dan
> 
> I think Patrick did this on purpose, because of the lack of
> perturbation in cls_flow.c : If all these frames were mapped to a single
> flow, they might interfere with another regular flow and hurt it.
> 
> I don't qualify existing code as buggy. It's about fallback behavior
> anyway (I don't think it's even documented)

Thanks for the review Eric.

Won't virtually all uses of proto-dst also use the dst key anyway? In
which case this fallback does nothing except make the TOS affect the
hash output: the destination address is the same, and the dst_entry
would be the same too if it weren't for the different TOS (by far the
common case). I don't see the value of the unintuitive behavior.

I'm not certain this is a problem but also note that including TOS will
mean that packets within a tunnel will be reordered if 'tos inherit' is
set on the tunnel and only the typical src,dst,proto,proto-src,proto-dst
is used. Again, probably not expected.

> If you have too many frames going to the fallback, then this classifier
> is probably not the one you should use ?

If you have significant traffic in tunnels then any 5-tuple approach is
going to present problems unless you look into the tunnel (like my other
patch :) )

> Hint: You can change your filter to use this classifier only on TCP/UDP
> traffic, and use another one on other protocols. Coupled to your qdisc
> rules, you can even limit to X percent the bandwidth allocated to this
> traffic
> 
> We could argue that if the TOS value of two packets is different, then
> the packets belong to different flows as well. [ It seems we currently lack
> a FLOW_KEY_TOS : that could be a useful addition ]





* Re: [RFD] Network configuration data in sysfs
From: David Miller @ 2011-10-24  0:49 UTC
  To: kirill
  Cc: netdev, kuznet, jmorris, yoshfuji, kaber, gregkh, kay.sievers,
	gladkov.alexey
In-Reply-To: <20111023233558.GA23264@shutemov.name>

From: "Kirill A. Shutemov" <kirill@shutemov.name>
Date: Mon, 24 Oct 2011 02:35:58 +0300

> Is there something fundamental preventing us from having a sysfs interface
> for network configuration?

Netlink already provides everything sysfs would potentially provide as
well as event tracking.

udev could just listen to a netlink socket and notice all changes to
addresses, routes, and device states.
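
As a rough illustration of the listening side, a userspace process can subscribe to rtnetlink multicast groups and read change notifications from the socket. The sketch below is an assumption about how such a listener might look, not udev code. The rtnl_monitor_* names are invented, and only the link, IPv4 address and IPv4 route groups are subscribed.

```c
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

/* Build the address that subscribes to link, IPv4 address and IPv4 route events. */
struct sockaddr_nl rtnl_monitor_addr(void)
{
	struct sockaddr_nl addr;

	memset(&addr, 0, sizeof(addr));
	addr.nl_family = AF_NETLINK;
	addr.nl_groups = RTMGRP_LINK | RTMGRP_IPV4_IFADDR | RTMGRP_IPV4_ROUTE;
	return addr;
}

/* Open and bind an rtnetlink socket; returns the fd, or -1 on failure. */
int rtnl_monitor_open(void)
{
	struct sockaddr_nl addr = rtnl_monitor_addr();
	int fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);

	if (fd < 0)
		return -1;
	if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}

/*
 * Block until the kernel reports a change, then return the type of the
 * first message in the datagram (RTM_NEWLINK, RTM_NEWADDR, RTM_DELROUTE,
 * ...), or -1 on error.
 */
int rtnl_monitor_next_type(int fd)
{
	union {
		struct nlmsghdr hdr;	/* keeps the buffer suitably aligned */
		char raw[8192];
	} buf;
	ssize_t len = recv(fd, buf.raw, sizeof(buf.raw), 0);

	if (len < 0 || !NLMSG_OK(&buf.hdr, (unsigned int)len))
		return -1;
	return buf.hdr.nlmsg_type;
}
```

A caller would loop on rtnl_monitor_next_type() and dispatch on the returned message type. A real listener would also walk every message in the datagram with NLMSG_NEXT() and parse the attributes.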


* Re: [PATCH] ipv4: fix ipsec forward performance regression
From: Yan, Zheng @ 2011-10-24  0:41 UTC
  To: Julian Anastasov
  Cc: netdev@vger.kernel.org, davem@davemloft.net,
	eric.dumazet@gmail.com, Kim Phillips
In-Reply-To: <alpine.LFD.2.00.1110231533410.1499@ja.ssi.bg>

On 10/23/2011 10:52 PM, Julian Anastasov wrote:
> 
> 	Hello,
> 
> On Sun, 23 Oct 2011, Yan, Zheng wrote:
> 
>> There is a bug in commit 5e2b61f (ipv4: Remove flowi from struct rtable).
>> It makes xfrm4_fill_dst() modify the wrong data structure.
>>
>> Signed-off-by: Zheng Yan <zheng.z.yan@intel.com>
>> ---
>>  net/ipv4/xfrm4_policy.c |   14 +++++++-------
>>  1 files changed, 7 insertions(+), 7 deletions(-)
>>
>> diff --git a/net/ipv4/xfrm4_policy.c b/net/ipv4/xfrm4_policy.c
>> index fc5368a..a0b4c5d 100644
>> --- a/net/ipv4/xfrm4_policy.c
>> +++ b/net/ipv4/xfrm4_policy.c
>> @@ -79,13 +79,13 @@ static int xfrm4_fill_dst(struct xfrm_dst *xdst, struct net_device *dev,
>>  	struct rtable *rt = (struct rtable *)xdst->route;
>>  	const struct flowi4 *fl4 = &fl->u.ip4;
>>  
>> -	rt->rt_key_dst = fl4->daddr;
>> -	rt->rt_key_src = fl4->saddr;
>> -	rt->rt_key_tos = fl4->flowi4_tos;
>> -	rt->rt_route_iif = fl4->flowi4_iif;
>> -	rt->rt_iif = fl4->flowi4_iif;
>> -	rt->rt_oif = fl4->flowi4_oif;
>> -	rt->rt_mark = fl4->flowi4_mark;
>> +	xdst->u.rt.rt_key_dst = fl4->daddr;
>> +	xdst->u.rt.rt_key_src = fl4->saddr;
>> +	xdst->u.rt.rt_key_tos = fl4->flowi4_tos;
>> +	xdst->u.rt.rt_route_iif = fl4->flowi4_iif;
> 
> 	May be I'm missing something but I don't see where
> flowi4_iif is set for the forwarding case. __xfrm_route_forward
> calls xfrm_decode_session which does not appear to set
> flowi4_iif. When providing fl4 for output routes flowi4_iif
> is always set to 0, so it represents rt_route_iif. But
> then there are 2 variants for __ip_route_output_key:
> 
> - ip_route_output_slow sets flowi4_iif to loopback and
> flowi4_oif to outdev during lookup but never restores them
> to original values. It is assumed that caller uses outdev
> from dst, not from flowi4_oif.
> 
> - for cached route we do not update flowi4_iif and flowi4_oif
> in __ip_route_output_key, so the resulting fl4 can not be
> used for these values. I assume, the current rules are that
> only fl4.saddr and daddr are updated while flowi4_iif and
> flowi4_oif are not. It looks wrong for flowi code to rely on them.
> 
> 	Currently, we have 3 values for devices:
> 
> rt_iif: indev for input routes, resulting outdev for output routes
> which plays the role as indev for loopback traffic.
> 
> rt_oif: original outdev key, 0 for input routes, can be 0 for
> output routes if socket is not bound to oif
> 
> rt_route_iif: indev for input routes, 0 for output routes
> 
> 	With above rules for flowi4_iif and flowi4_oif
> it is impossible to select value for rt_iif from fl4.
> 
> 	I don't know the xfrm code well, may be after the

Neither do I. My understanding is that xfrm_dst(s) are managed by the
flow cache (net/core/flow.c). We don't put them into the routing cache.

Regards
Yan, Zheng 

> mentioned change we damaged rt_oif and rt_route_iif values
> for cached dst which can lead to using slow path all the time.
> Even if rt_intern_hash() avoids caching similar dsts multiple
> times, if cached entry is damaged we will add more and
> more new entries after every damage.
> 
>> +	xdst->u.rt.rt_iif = fl4->flowi4_iif;
>> +	xdst->u.rt.rt_oif = fl4->flowi4_oif;
>> +	xdst->u.rt.rt_mark = fl4->flowi4_mark;
>>  
>>  	xdst->u.dst.dev = dev;
>>  	dev_hold(dev);
> 
> Regards
> 
> --
> Julian Anastasov <ja@ssi.bg>
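
The difference between the removed and added lines of the patch can be modeled in userspace. This is a simplified sketch: the model_* structures are stand-ins with only two fields each, not the kernel definitions, and the function names are invented for illustration.

```c
#include <stdint.h>

/* Simplified stand-in for struct rtable; only two of the keys are modeled. */
struct model_rtable {
	uint32_t rt_key_dst;
	int rt_oif;
};

/* Simplified stand-in for struct xfrm_dst. */
struct model_xfrm_dst {
	struct model_rtable rt;		/* stands in for xdst->u.rt */
	struct model_rtable *route;	/* stands in for xdst->route */
};

/* Pre-fix behaviour: the writes land in the route that xdst->route points at. */
void fill_dst_via_route(struct model_xfrm_dst *xdst, uint32_t daddr, int oif)
{
	struct model_rtable *rt = xdst->route;

	rt->rt_key_dst = daddr;
	rt->rt_oif = oif;
}

/* Post-fix behaviour: the writes land in the rtable embedded in the xfrm_dst. */
void fill_dst_embedded(struct model_xfrm_dst *xdst, uint32_t daddr, int oif)
{
	xdst->rt.rt_key_dst = daddr;
	xdst->rt.rt_oif = oif;
}
```

In this model, fill_dst_via_route() overwrites the keys of the separate route object, which is the damage to the cached dst described above, while fill_dst_embedded() leaves that object untouched.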


* [RFD] Network configuration data in sysfs
From: Kirill A. Shutemov @ 2011-10-23 23:35 UTC
  To: netdev
  Cc: David S. Miller, Alexey Kuznetsov, James Morris,
	Hideaki YOSHIFUJI, Patrick McHardy, Greg Kroah-Hartman,
	Kay Sievers, Alexey Gladkov

Hi,

Currently there's no way to set or inspect network configuration (protocol
addresses, routes, etc.) through sysfs. Yes, we have a netlink interface for
this, but sysfs has advantages:

- change or inspect network configuration using standard unix utilities
  (echo, cat, etc.). It's useful at least in restricted environments where
  no special utilities are available -- initrd or a stripped-down busybox.

- transparent udev support. It would be nice to get this information to
  udev.

Is there something fundamental preventing us from having a sysfs interface
for network configuration?
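
For comparison, per-device attributes such as the MTU are already readable as plain files under /sys/class/net/<ifname>/, which is the style of access meant here for addresses and routes. A minimal sketch of reading one such attribute follows; the helper names are invented for illustration and error handling is kept to a bare minimum.

```c
#include <stdio.h>

/* Build "/sys/class/net/<ifname>/<attr>"; returns 0 on success, -1 if it does not fit. */
int sysfs_net_attr_path(char *buf, size_t size, const char *ifname,
			const char *attr)
{
	int n = snprintf(buf, size, "/sys/class/net/%s/%s", ifname, attr);

	return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/* Read an integer attribute such as "mtu"; returns 0 on success, -1 on failure. */
int sysfs_net_read_long(const char *ifname, const char *attr, long *value)
{
	char path[256];
	FILE *f;
	int ok;

	if (sysfs_net_attr_path(path, sizeof(path), ifname, attr) < 0)
		return -1;
	f = fopen(path, "r");
	if (!f)
		return -1;
	ok = fscanf(f, "%ld", value) == 1;
	fclose(f);
	return ok ? 0 : -1;
}
```

Calling sysfs_net_read_long("eth0", "mtu", &mtu) is the programmatic equivalent of `cat /sys/class/net/eth0/mtu`, where "eth0" is only an example device name.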

-- 
 Kirill A. Shutemov


* Re: [patch net-next V2] net: introduce ethernet teaming device
From: Or Gerlitz @ 2011-10-23 21:46 UTC
  To: Jiri Pirko
  Cc: netdev, davem, eric.dumazet, bhutchings, shemminger, fubar, andy,
	tgraf, ebiederm, mirqus, kaber, greearb, jesse, fbl,
	benjamin.poirier, jzupka
In-Reply-To: <1319200747-2508-1-git-send-email-jpirko@redhat.com>

On Fri, Oct 21, 2011 at 2:39 PM, Jiri Pirko <jpirko@redhat.com> wrote:
> This patch introduces a new network device called team. It is supposed to be
> a very fast, simple, userspace-driven alternative to the existing bonding driver.

Jiri,

Could you elaborate a little further on the motivation for this
teaming approach/solution vs. the current bonding driver? You say that
it is supposed to be very fast, simple and userspace-driven, so... do
you find bonding not to be fast enough? Or too complex? Or would you
prefer to see bonding's driving logic, which currently lives in the
kernel, in user space? Anything else?

thanks,

Or.


>
> A userspace library called libteam, with a couple of demo apps, is available
> here:
> https://github.com/jpirko/libteam
> Note it's still in its diapers atm.
>
> team<->libteam use generic netlink for communication. That and rtnl
> are supposed to be the only ways to configure a team device, no sysfs etc.
>
> A Python binding basis for libteam was recently introduced (some work
> still needs to be done on it though). A daemon providing arpmon/miimon
> active-backup functionality will be introduced shortly.
> Everything necessary is already implemented in the kernel team driver.
>
> Signed-off-by: Jiri Pirko <jpirko@redhat.com>
>
> v1->v2:
>        - modes are made as modules. Makes team more modular and
>          extendable.
>        - several commenters' nitpicks found on v1 were fixed
>        - several other bugs were fixed.
>        - note I ignored Eric's comment about roundrobin port selector
>          as Eric's way may be easily implemented as another mode (mode
>          "random") in future.
> ---
>  Documentation/networking/team.txt         |    2 +
>  MAINTAINERS                               |    7 +
>  drivers/net/Kconfig                       |    2 +
>  drivers/net/Makefile                      |    1 +
>  drivers/net/team/Kconfig                  |   38 +
>  drivers/net/team/Makefile                 |    7 +
>  drivers/net/team/team.c                   | 1593 +++++++++++++++++++++++++++++
>  drivers/net/team/team_mode_activebackup.c |  152 +++
>  drivers/net/team/team_mode_roundrobin.c   |  107 ++
>  include/linux/Kbuild                      |    1 +
>  include/linux/if.h                        |    1 +
>  include/linux/if_team.h                   |  233 +++++
>  12 files changed, 2144 insertions(+), 0 deletions(-)
>  create mode 100644 Documentation/networking/team.txt
>  create mode 100644 drivers/net/team/Kconfig
>  create mode 100644 drivers/net/team/Makefile
>  create mode 100644 drivers/net/team/team.c
>  create mode 100644 drivers/net/team/team_mode_activebackup.c
>  create mode 100644 drivers/net/team/team_mode_roundrobin.c
>  create mode 100644 include/linux/if_team.h
>
> diff --git a/Documentation/networking/team.txt b/Documentation/networking/team.txt
> new file mode 100644
> index 0000000..5a01368
> --- /dev/null
> +++ b/Documentation/networking/team.txt
> @@ -0,0 +1,2 @@
> +Team devices are driven from userspace via libteam library which is here:
> +       https://github.com/jpirko/libteam
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 5008b08..c33400d 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -6372,6 +6372,13 @@ W:       http://tcp-lp-mod.sourceforge.net/
>  S:     Maintained
>  F:     net/ipv4/tcp_lp.c
>
> +TEAM DRIVER
> +M:     Jiri Pirko <jpirko@redhat.com>
> +L:     netdev@vger.kernel.org
> +S:     Supported
> +F:     drivers/net/team/
> +F:     include/linux/if_team.h
> +
>  TEGRA SUPPORT
>  M:     Colin Cross <ccross@android.com>
>  M:     Erik Gilling <konkers@android.com>
> diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig
> index 583f66c..b3020be 100644
> --- a/drivers/net/Kconfig
> +++ b/drivers/net/Kconfig
> @@ -125,6 +125,8 @@ config IFB
>          'ifb1' etc.
>          Look at the iproute2 documentation directory for usage etc
>
> +source "drivers/net/team/Kconfig"
> +
>  config MACVLAN
>        tristate "MAC-VLAN support (EXPERIMENTAL)"
>        depends on EXPERIMENTAL
> diff --git a/drivers/net/Makefile b/drivers/net/Makefile
> index fa877cd..4e4ebfe 100644
> --- a/drivers/net/Makefile
> +++ b/drivers/net/Makefile
> @@ -17,6 +17,7 @@ obj-$(CONFIG_NET) += Space.o loopback.o
>  obj-$(CONFIG_NETCONSOLE) += netconsole.o
>  obj-$(CONFIG_PHYLIB) += phy/
>  obj-$(CONFIG_RIONET) += rionet.o
> +obj-$(CONFIG_NET_TEAM) += team/
>  obj-$(CONFIG_TUN) += tun.o
>  obj-$(CONFIG_VETH) += veth.o
>  obj-$(CONFIG_VIRTIO_NET) += virtio_net.o
> diff --git a/drivers/net/team/Kconfig b/drivers/net/team/Kconfig
> new file mode 100644
> index 0000000..70a43a6
> --- /dev/null
> +++ b/drivers/net/team/Kconfig
> @@ -0,0 +1,38 @@
> +menuconfig NET_TEAM
> +       tristate "Ethernet team driver support (EXPERIMENTAL)"
> +       depends on EXPERIMENTAL
> +       ---help---
> +         This allows one to create virtual interfaces that teams together
> +         multiple ethernet devices.
> +
> +         Team devices can be added using the "ip" command from the
> +         iproute2 package:
> +
> +         "ip link add link [ address MAC ] [ NAME ] type team"
> +
> +         To compile this driver as a module, choose M here: the module
> +         will be called team.
> +
> +if NET_TEAM
> +
> +config NET_TEAM_MODE_ROUNDROBIN
> +       tristate "Round-robin mode support"
> +       depends on NET_TEAM
> +       ---help---
> +         Basic mode where port used for transmitting packets is selected in
> +         round-robin fashion using packet counter.
> +
> +         To compile this team mode as a module, choose M here: the module
> +         will be called team_mode_roundrobin.
> +
> +config NET_TEAM_MODE_ACTIVEBACKUP
> +       tristate "Active-backup mode support"
> +       depends on NET_TEAM
> +       ---help---
> +         Only one port is active at a time and the rest of ports are used
> +         for backup.
> +
> +         To compile this team mode as a module, choose M here: the module
> +         will be called team_mode_activebackup.
> +
> +endif # NET_TEAM
> diff --git a/drivers/net/team/Makefile b/drivers/net/team/Makefile
> new file mode 100644
> index 0000000..85f2028
> --- /dev/null
> +++ b/drivers/net/team/Makefile
> @@ -0,0 +1,7 @@
> +#
> +# Makefile for the network team driver
> +#
> +
> +obj-$(CONFIG_NET_TEAM) += team.o
> +obj-$(CONFIG_NET_TEAM_MODE_ROUNDROBIN) += team_mode_roundrobin.o
> +obj-$(CONFIG_NET_TEAM_MODE_ACTIVEBACKUP) += team_mode_activebackup.o
> diff --git a/drivers/net/team/team.c b/drivers/net/team/team.c
> new file mode 100644
> index 0000000..398be58
> --- /dev/null
> +++ b/drivers/net/team/team.c
> @@ -0,0 +1,1593 @@
> +/*
> + * drivers/net/team/team.c - Network team device driver
> + * Copyright (c) 2011 Jiri Pirko <jpirko@redhat.com>
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + */
> +
> +#include <linux/kernel.h>
> +#include <linux/types.h>
> +#include <linux/module.h>
> +#include <linux/init.h>
> +#include <linux/slab.h>
> +#include <linux/rcupdate.h>
> +#include <linux/errno.h>
> +#include <linux/ctype.h>
> +#include <linux/notifier.h>
> +#include <linux/netdevice.h>
> +#include <linux/if_arp.h>
> +#include <linux/socket.h>
> +#include <linux/etherdevice.h>
> +#include <linux/rtnetlink.h>
> +#include <net/rtnetlink.h>
> +#include <net/genetlink.h>
> +#include <net/netlink.h>
> +#include <linux/if_team.h>
> +
> +#define DRV_NAME "team"
> +
> +
> +/**********
> + * Helpers
> + **********/
> +
> +#define team_port_exists(dev) (dev->priv_flags & IFF_TEAM_PORT)
> +
> +static struct team_port *team_port_get_rcu(const struct net_device *dev)
> +{
> +       struct team_port *port = rcu_dereference(dev->rx_handler_data);
> +
> +       return team_port_exists(dev) ? port : NULL;
> +}
> +
> +static struct team_port *team_port_get_rtnl(const struct net_device *dev)
> +{
> +       struct team_port *port = rtnl_dereference(dev->rx_handler_data);
> +
> +       return team_port_exists(dev) ? port : NULL;
> +}
> +
> +/*
> + * Since the ability to change mac address for open port device is tested in
> + * team_port_add, this function can be called without control of return value
> + */
> +static int __set_port_mac(struct net_device *port_dev,
> +                         const unsigned char *dev_addr)
> +{
> +       struct sockaddr addr;
> +
> +       memcpy(addr.sa_data, dev_addr, ETH_ALEN);
> +       addr.sa_family = ARPHRD_ETHER;
> +       return dev_set_mac_address(port_dev, &addr);
> +}
> +
> +int team_port_set_orig_mac(struct team_port *port)
> +{
> +       return __set_port_mac(port->dev, port->orig.dev_addr);
> +}
> +EXPORT_SYMBOL(team_port_set_orig_mac);
> +
> +int team_port_set_team_mac(struct team_port *port)
> +{
> +       return __set_port_mac(port->dev, port->team->dev->dev_addr);
> +}
> +EXPORT_SYMBOL(team_port_set_team_mac);
> +
> +
> +/*******************
> + * Options handling
> + *******************/
> +
> +void team_options_register(struct team *team, struct team_option *option,
> +                          size_t option_count)
> +{
> +       int i;
> +
> +       for (i = 0; i < option_count; i++, option++)
> +               list_add_tail(&option->list, &team->option_list);
> +}
> +EXPORT_SYMBOL(team_options_register);
> +
> +static void __team_options_change_check(struct team *team,
> +                                       struct team_option *changed_option);
> +
> +static void __team_options_unregister(struct team *team,
> +                                     struct team_option *option,
> +                                     size_t option_count)
> +{
> +       int i;
> +
> +       for (i = 0; i < option_count; i++, option++)
> +               list_del(&option->list);
> +}
> +
> +void team_options_unregister(struct team *team, struct team_option *option,
> +                            size_t option_count)
> +{
> +       __team_options_unregister(team, option, option_count);
> +       __team_options_change_check(team, NULL);
> +}
> +EXPORT_SYMBOL(team_options_unregister);
> +
> +static int team_option_get(struct team *team, struct team_option *option,
> +                          void *arg)
> +{
> +       return option->getter(team, arg);
> +}
> +
> +static int team_option_set(struct team *team, struct team_option *option,
> +                          void *arg)
> +{
> +       int err;
> +
> +       err = option->setter(team, arg);
> +       if (err)
> +               return err;
> +
> +       __team_options_change_check(team, option);
> +       return err;
> +}
> +
> +/****************
> + * Mode handling
> + ****************/
> +
> +static LIST_HEAD(mode_list);
> +static DEFINE_SPINLOCK(mode_list_lock);
> +
> +static struct team_mode *__find_mode(const char *kind)
> +{
> +       struct team_mode *mode;
> +
> +       list_for_each_entry(mode, &mode_list, list) {
> +               if (strcmp(mode->kind, kind) == 0)
> +                       return mode;
> +       }
> +       return NULL;
> +}
> +
> +static bool is_good_mode_name(const char *name)
> +{
> +       while (*name != '\0') {
> +               if (!isalpha(*name) && !isdigit(*name) && *name != '_')
> +                       return false;
> +               name++;
> +       }
> +       return true;
> +}
> +
> +int team_mode_register(struct team_mode *mode)
> +{
> +       int err = 0;
> +
> +       if (!is_good_mode_name(mode->kind) ||
> +           mode->priv_size > TEAM_MODE_PRIV_SIZE)
> +               return -EINVAL;
> +       spin_lock(&mode_list_lock);
> +       if (__find_mode(mode->kind)) {
> +               err = -EEXIST;
> +               goto unlock;
> +       }
> +       list_add_tail(&mode->list, &mode_list);
> +unlock:
> +       spin_unlock(&mode_list_lock);
> +       return err;
> +}
> +EXPORT_SYMBOL(team_mode_register);
> +
> +int team_mode_unregister(struct team_mode *mode)
> +{
> +       spin_lock(&mode_list_lock);
> +       list_del_init(&mode->list);
> +       spin_unlock(&mode_list_lock);
> +       return 0;
> +}
> +EXPORT_SYMBOL(team_mode_unregister);
> +
> +static struct team_mode *team_mode_get(const char *kind)
> +{
> +       struct team_mode *mode;
> +
> +       spin_lock(&mode_list_lock);
> +       mode = __find_mode(kind);
> +       if (!mode) {
> +               spin_unlock(&mode_list_lock);
> +               request_module("team-mode-%s", kind);
> +               spin_lock(&mode_list_lock);
> +               mode = __find_mode(kind);
> +       }
> +       if (mode)
> +               if (!try_module_get(mode->owner))
> +                       mode = NULL;
> +
> +       spin_unlock(&mode_list_lock);
> +       return mode;
> +}
> +
> +static void team_mode_put(const char *kind)
> +{
> +       struct team_mode *mode;
> +
> +       spin_lock(&mode_list_lock);
> +       mode = __find_mode(kind);
> +       BUG_ON(!mode);
> +       module_put(mode->owner);
> +       spin_unlock(&mode_list_lock);
> +}
> +
> +/*
> + * We can benefit from the fact that it's ensured no port is present
> + * at the time of mode change.
> + */
> +static int __team_change_mode(struct team *team,
> +                             const struct team_mode *new_mode)
> +{
> +       /* Check if mode was previously set and do cleanup if so */
> +       if (team->mode_kind) {
> +               void (*exit_op)(struct team *team) = team->mode_ops.exit;
> +
> +               /* Clear ops area so no callback is called any longer */
> +               memset(&team->mode_ops, 0, sizeof(struct team_mode_ops));
> +
> +               synchronize_rcu();
> +
> +               if (exit_op)
> +                       exit_op(team);
> +               team_mode_put(team->mode_kind);
> +               team->mode_kind = NULL;
> +               /* zero private data area */
> +               memset(&team->mode_priv, 0,
> +                      sizeof(struct team) - offsetof(struct team, mode_priv));
> +       }
> +
> +       if (!new_mode)
> +               return 0;
> +
> +       if (new_mode->ops->init) {
> +               int err;
> +
> +               err = new_mode->ops->init(team);
> +               if (err)
> +                       return err;
> +       }
> +
> +       team->mode_kind = new_mode->kind;
> +       memcpy(&team->mode_ops, new_mode->ops, sizeof(struct team_mode_ops));
> +
> +       return 0;
> +}
> +
> +static int team_change_mode(struct team *team, const char *kind)
> +{
> +       struct team_mode *new_mode;
> +       struct net_device *dev = team->dev;
> +       int err;
> +
> +       if (!list_empty(&team->port_list)) {
> +               netdev_err(dev, "No ports can be present during mode change\n");
> +               return -EBUSY;
> +       }
> +
> +       if (team->mode_kind && strcmp(team->mode_kind, kind) == 0) {
> +               netdev_err(dev, "Unable to change to the same mode the team is in\n");
> +               return -EINVAL;
> +       }
> +
> +       new_mode = team_mode_get(kind);
> +       if (!new_mode) {
> +               netdev_err(dev, "Mode \"%s\" not found\n", kind);
> +               return -EINVAL;
> +       }
> +
> +       err = __team_change_mode(team, new_mode);
> +       if (err) {
> +               netdev_err(dev, "Failed to change to mode \"%s\"\n", kind);
> +               team_mode_put(kind);
> +               return err;
> +       }
> +
> +       netdev_info(dev, "Mode changed to \"%s\"\n", kind);
> +       return 0;
> +}
> +
> +
> +/************************
> + * Rx path frame handler
> + ************************/
> +
> +/* note: already called with rcu_read_lock */
> +static rx_handler_result_t team_handle_frame(struct sk_buff **pskb)
> +{
> +       struct sk_buff *skb = *pskb;
> +       struct team_port *port;
> +       struct team *team;
> +       rx_handler_result_t res = RX_HANDLER_ANOTHER;
> +
> +       skb = skb_share_check(skb, GFP_ATOMIC);
> +       if (!skb)
> +               return RX_HANDLER_CONSUMED;
> +
> +       *pskb = skb;
> +
> +       port = team_port_get_rcu(skb->dev);
> +       team = port->team;
> +
> +       if (team->mode_ops.receive)
> +               res = team->mode_ops.receive(team, port, skb);
> +
> +       if (res == RX_HANDLER_ANOTHER) {
> +               struct team_pcpu_stats *pcpu_stats;
> +
> +               pcpu_stats = this_cpu_ptr(team->pcpu_stats);
> +               u64_stats_update_begin(&pcpu_stats->syncp);
> +               pcpu_stats->rx_packets++;
> +               pcpu_stats->rx_bytes += skb->len;
> +               if (skb->pkt_type == PACKET_MULTICAST)
> +                       pcpu_stats->rx_multicast++;
> +               u64_stats_update_end(&pcpu_stats->syncp);
> +
> +               skb->dev = team->dev;
> +       } else {
> +               this_cpu_inc(team->pcpu_stats->rx_dropped);
> +       }
> +
> +       return res;
> +}
> +
> +
> +/****************
> + * Port handling
> + ****************/
> +
> +static bool team_port_find(const struct team *team,
> +                          const struct team_port *port)
> +{
> +       struct team_port *cur;
> +
> +       list_for_each_entry(cur, &team->port_list, list)
> +               if (cur == port)
> +                       return true;
> +       return false;
> +}
> +
> +static int team_port_list_init(struct team *team)
> +{
> +       int i;
> +       struct hlist_head *hash;
> +
> +       hash = kmalloc(sizeof(*hash) * TEAM_PORT_HASHENTRIES, GFP_KERNEL);
> +       if (!hash)
> +               return -ENOMEM;
> +
> +       for (i = 0; i < TEAM_PORT_HASHENTRIES; i++)
> +               INIT_HLIST_HEAD(&hash[i]);
> +       team->port_hlist = hash;
> +       INIT_LIST_HEAD(&team->port_list);
> +       return 0;
> +}
> +
> +static void team_port_list_fini(struct team *team)
> +{
> +       kfree(team->port_hlist);
> +}
> +
> +/*
> + * Add/delete port to the team port list. Write guarded by rtnl_lock.
> + * Takes care of correct port->index setup (might be racy).
> + */
> +static void team_port_list_add_port(struct team *team,
> +                                   struct team_port *port)
> +{
> +       port->index = team->port_count++;
> +       hlist_add_head_rcu(&port->hlist,
> +                          team_port_index_hash(team, port->index));
> +       list_add_tail_rcu(&port->list, &team->port_list);
> +}
> +
> +static void __reconstruct_port_hlist(struct team *team, int rm_index)
> +{
> +       int i;
> +       struct team_port *port;
> +
> +       for (i = rm_index + 1; i < team->port_count; i++) {
> +               port = team_get_port_by_index_rcu(team, i);
> +               hlist_del_rcu(&port->hlist);
> +               port->index--;
> +               hlist_add_head_rcu(&port->hlist,
> +                                  team_port_index_hash(team, port->index));
> +       }
> +}
> +
> +static void team_port_list_del_port(struct team *team,
> +                                  struct team_port *port)
> +{
> +       int rm_index = port->index;
> +
> +       hlist_del_rcu(&port->hlist);
> +       list_del_rcu(&port->list);
> +       __reconstruct_port_hlist(team, rm_index);
> +       team->port_count--;
> +}
> +
> +#define TEAM_VLAN_FEATURES (NETIF_F_ALL_CSUM | NETIF_F_SG | \
> +                           NETIF_F_FRAGLIST | NETIF_F_ALL_TSO | \
> +                           NETIF_F_HIGHDMA | NETIF_F_LRO)
> +
> +static void __team_compute_features(struct team *team)
> +{
> +       struct team_port *port;
> +       u32 vlan_features = TEAM_VLAN_FEATURES;
> +       unsigned short max_hard_header_len = ETH_HLEN;
> +
> +       list_for_each_entry(port, &team->port_list, list) {
> +               vlan_features = netdev_increment_features(vlan_features,
> +                                       port->dev->vlan_features,
> +                                       TEAM_VLAN_FEATURES);
> +
> +               if (port->dev->hard_header_len > max_hard_header_len)
> +                       max_hard_header_len = port->dev->hard_header_len;
> +       }
> +
> +       team->dev->vlan_features = vlan_features;
> +       team->dev->hard_header_len = max_hard_header_len;
> +
> +       netdev_change_features(team->dev);
> +}
> +
> +static void team_compute_features(struct team *team)
> +{
> +       spin_lock(&team->lock);
> +       __team_compute_features(team);
> +       spin_unlock(&team->lock);
> +}
> +
> +static int team_port_enter(struct team *team, struct team_port *port)
> +{
> +       int err = 0;
> +
> +       dev_hold(team->dev);
> +       port->dev->priv_flags |= IFF_TEAM_PORT;
> +       if (team->mode_ops.port_enter) {
> +               err = team->mode_ops.port_enter(team, port);
> +               if (err)
> +                       netdev_err(team->dev, "Device %s failed to enter team mode\n",
> +                                  port->dev->name);
> +       }
> +       return err;
> +}
> +
> +static void team_port_leave(struct team *team, struct team_port *port)
> +{
> +       if (team->mode_ops.port_leave)
> +               team->mode_ops.port_leave(team, port);
> +       port->dev->priv_flags &= ~IFF_TEAM_PORT;
> +       dev_put(team->dev);
> +}
> +
> +static void __team_port_change_check(struct team_port *port, bool linkup);
> +
> +static int team_port_add(struct team *team, struct net_device *port_dev)
> +{
> +       struct net_device *dev = team->dev;
> +       struct team_port *port;
> +       char *portname = port_dev->name;
> +       char tmp_addr[ETH_ALEN];
> +       int err;
> +
> +       if (port_dev->flags & IFF_LOOPBACK ||
> +           port_dev->type != ARPHRD_ETHER) {
> +               netdev_err(dev, "Device %s is of an unsupported type\n",
> +                          portname);
> +               return -EINVAL;
> +       }
> +
> +       if (team_port_exists(port_dev)) {
> +               netdev_err(dev, "Device %s is already a port "
> +                               "of a team device\n", portname);
> +               return -EBUSY;
> +       }
> +
> +       if (port_dev->flags & IFF_UP) {
> +               netdev_err(dev, "Device %s is up. Set it down before adding it as a team port\n",
> +                          portname);
> +               return -EBUSY;
> +       }
> +
> +       port = kzalloc(sizeof(struct team_port), GFP_KERNEL);
> +       if (!port)
> +               return -ENOMEM;
> +
> +       port->dev = port_dev;
> +       port->team = team;
> +
> +       port->orig.mtu = port_dev->mtu;
> +       err = dev_set_mtu(port_dev, dev->mtu);
> +       if (err) {
> +               netdev_dbg(dev, "Error %d calling dev_set_mtu\n", err);
> +               goto err_set_mtu;
> +       }
> +
> +       memcpy(port->orig.dev_addr, port_dev->dev_addr, ETH_ALEN);
> +       random_ether_addr(tmp_addr);
> +       err = __set_port_mac(port_dev, tmp_addr);
> +       if (err) {
> +               netdev_dbg(dev, "Device %s mac addr set failed\n",
> +                          portname);
> +               goto err_set_mac_rand;
> +       }
> +
> +       err = dev_open(port_dev);
> +       if (err) {
> +               netdev_dbg(dev, "Device %s opening failed\n",
> +                          portname);
> +               goto err_dev_open;
> +       }
> +
> +       err = team_port_set_orig_mac(port);
> +       if (err) {
> +               netdev_dbg(dev, "Device %s mac addr set failed - Device does not support addr change when it's opened\n",
> +                          portname);
> +               goto err_set_mac_opened;
> +       }
> +
> +       err = team_port_enter(team, port);
> +       if (err) {
> +               netdev_err(dev, "Device %s failed to enter team mode\n",
> +                          portname);
> +               goto err_port_enter;
> +       }
> +
> +       err = netdev_set_master(port_dev, dev);
> +       if (err) {
> +               netdev_err(dev, "Device %s failed to set master\n", portname);
> +               goto err_set_master;
> +       }
> +
> +       err = netdev_rx_handler_register(port_dev, team_handle_frame,
> +                                        port);
> +       if (err) {
> +               netdev_err(dev, "Device %s failed to register rx_handler\n",
> +                          portname);
> +               goto err_handler_register;
> +       }
> +
> +       team_port_list_add_port(team, port);
> +       __team_compute_features(team);
> +       __team_port_change_check(port, !!netif_carrier_ok(port_dev));
> +
> +       netdev_info(dev, "Port device %s added\n", portname);
> +
> +       return 0;
> +
> +err_handler_register:
> +       netdev_set_master(port_dev, NULL);
> +
> +err_set_master:
> +       team_port_leave(team, port);
> +
> +err_port_enter:
> +err_set_mac_opened:
> +       dev_close(port_dev);
> +
> +err_dev_open:
> +       team_port_set_orig_mac(port);
> +
> +err_set_mac_rand:
> +       dev_set_mtu(port_dev, port->orig.mtu);
> +
> +err_set_mtu:
> +       kfree(port);
> +
> +       return err;
> +}
> +
> +static int team_port_del(struct team *team, struct net_device *port_dev)
> +{
> +       struct net_device *dev = team->dev;
> +       struct team_port *port;
> +       char *portname = port_dev->name;
> +
> +       port = team_port_get_rtnl(port_dev);
> +       if (!port || !team_port_find(team, port)) {
> +               netdev_err(dev, "Device %s does not act as a port of this team\n",
> +                          portname);
> +               return -ENOENT;
> +       }
> +
> +       __team_port_change_check(port, false);
> +       team_port_list_del_port(team, port);
> +       netdev_rx_handler_unregister(port_dev);
> +       netdev_set_master(port_dev, NULL);
> +       team_port_leave(team, port);
> +       dev_close(port_dev);
> +       team_port_set_orig_mac(port);
> +       dev_set_mtu(port_dev, port->orig.mtu);
> +       synchronize_rcu();
> +       kfree(port);
> +       netdev_info(dev, "Port device %s removed\n", portname);
> +       __team_compute_features(team);
> +
> +       return 0;
> +}
> +
> +
> +/*****************
> + * Net device ops
> + *****************/
> +
> +static const char team_no_mode_kind[] = "*NOMODE*";
> +
> +static int team_mode_option_get(struct team *team, void *arg)
> +{
> +       const char **str = arg;
> +
> +       *str = team->mode_kind ? team->mode_kind : team_no_mode_kind;
> +       return 0;
> +}
> +
> +static int team_mode_option_set(struct team *team, void *arg)
> +{
> +       const char **str = arg;
> +
> +       return team_change_mode(team, *str);
> +}
> +
> +static struct team_option team_options[] = {
> +       {
> +               .name = "mode",
> +               .type = TEAM_OPTION_TYPE_STRING,
> +               .getter = team_mode_option_get,
> +               .setter = team_mode_option_set,
> +       },
> +};
> +
> +static int team_init(struct net_device *dev)
> +{
> +       struct team *team = netdev_priv(dev);
> +       int err;
> +
> +       team->dev = dev;
> +       spin_lock_init(&team->lock);
> +
> +       team->pcpu_stats = alloc_percpu(struct team_pcpu_stats);
> +       if (!team->pcpu_stats)
> +               return -ENOMEM;
> +
> +       err = team_port_list_init(team);
> +       if (err)
> +               goto err_port_list_init;
> +
> +       INIT_LIST_HEAD(&team->option_list);
> +       team_options_register(team, team_options, ARRAY_SIZE(team_options));
> +       netif_carrier_off(dev);
> +
> +       return 0;
> +
> +err_port_list_init:
> +
> +       free_percpu(team->pcpu_stats);
> +
> +       return err;
> +}
> +
> +static void team_uninit(struct net_device *dev)
> +{
> +       struct team *team = netdev_priv(dev);
> +       struct team_port *port;
> +       struct team_port *tmp;
> +
> +       spin_lock(&team->lock);
> +       list_for_each_entry_safe(port, tmp, &team->port_list, list)
> +               team_port_del(team, port->dev);
> +
> +       __team_change_mode(team, NULL); /* cleanup */
> +       __team_options_unregister(team, team_options, ARRAY_SIZE(team_options));
> +       spin_unlock(&team->lock);
> +}
> +
> +static void team_destructor(struct net_device *dev)
> +{
> +       struct team *team = netdev_priv(dev);
> +
> +       team_port_list_fini(team);
> +       free_percpu(team->pcpu_stats);
> +       free_netdev(dev);
> +}
> +
> +static int team_open(struct net_device *dev)
> +{
> +       netif_carrier_on(dev);
> +       return 0;
> +}
> +
> +static int team_close(struct net_device *dev)
> +{
> +       netif_carrier_off(dev);
> +       return 0;
> +}
> +
> +/*
> + * note: already called with rcu_read_lock
> + */
> +static netdev_tx_t team_xmit(struct sk_buff *skb, struct net_device *dev)
> +{
> +       struct team *team = netdev_priv(dev);
> +       bool tx_success = false;
> +       unsigned int len = skb->len;
> +
> +       /*
> +        * Ensure transmit function is called only in case there is at least
> +        * one port present.
> +        */
> +       if (likely(!list_empty(&team->port_list) && team->mode_ops.transmit))
> +               tx_success = team->mode_ops.transmit(team, skb);
> +       if (tx_success) {
> +               struct team_pcpu_stats *pcpu_stats;
> +
> +               pcpu_stats = this_cpu_ptr(team->pcpu_stats);
> +               u64_stats_update_begin(&pcpu_stats->syncp);
> +               pcpu_stats->tx_packets++;
> +               pcpu_stats->tx_bytes += len;
> +               u64_stats_update_end(&pcpu_stats->syncp);
> +       } else {
> +               this_cpu_inc(team->pcpu_stats->tx_dropped);
> +       }
> +
> +       return NETDEV_TX_OK;
> +}
> +
> +static void team_change_rx_flags(struct net_device *dev, int change)
> +{
> +       struct team *team = netdev_priv(dev);
> +       struct team_port *port;
> +       int inc;
> +
> +       rcu_read_lock();
> +       list_for_each_entry_rcu(port, &team->port_list, list) {
> +               if (change & IFF_PROMISC) {
> +                       inc = dev->flags & IFF_PROMISC ? 1 : -1;
> +                       dev_set_promiscuity(port->dev, inc);
> +               }
> +               if (change & IFF_ALLMULTI) {
> +                       inc = dev->flags & IFF_ALLMULTI ? 1 : -1;
> +                       dev_set_allmulti(port->dev, inc);
> +               }
> +       }
> +       rcu_read_unlock();
> +}
> +
> +static void team_set_rx_mode(struct net_device *dev)
> +{
> +       struct team *team = netdev_priv(dev);
> +       struct team_port *port;
> +
> +       rcu_read_lock();
> +       list_for_each_entry_rcu(port, &team->port_list, list) {
> +               dev_uc_sync(port->dev, dev);
> +               dev_mc_sync(port->dev, dev);
> +       }
> +       rcu_read_unlock();
> +}
> +
> +static int team_set_mac_address(struct net_device *dev, void *p)
> +{
> +       struct team *team = netdev_priv(dev);
> +       struct team_port *port;
> +       struct sockaddr *addr = p;
> +
> +       memcpy(dev->dev_addr, addr->sa_data, ETH_ALEN);
> +       rcu_read_lock();
> +       list_for_each_entry_rcu(port, &team->port_list, list)
> +               if (team->mode_ops.port_change_mac)
> +                       team->mode_ops.port_change_mac(team, port);
> +       rcu_read_unlock();
> +       return 0;
> +}
> +
> +static int team_change_mtu(struct net_device *dev, int new_mtu)
> +{
> +       struct team *team = netdev_priv(dev);
> +       struct team_port *port;
> +       int err;
> +
> +       rcu_read_lock();
> +       list_for_each_entry_rcu(port, &team->port_list, list) {
> +               err = dev_set_mtu(port->dev, new_mtu);
> +               if (err) {
> +                       netdev_err(dev, "Device %s failed to change mtu\n",
> +                                  port->dev->name);
> +                       goto unwind;
> +               }
> +       }
> +       rcu_read_unlock();
> +
> +       dev->mtu = new_mtu;
> +
> +       return 0;
> +
> +unwind:
> +       list_for_each_entry_continue_reverse(port, &team->port_list, list)
> +               dev_set_mtu(port->dev, dev->mtu);
> +
> +       rcu_read_unlock();
> +       return err;
> +}
> +
> +static struct rtnl_link_stats64 *
> +team_get_stats64(struct net_device *dev, struct rtnl_link_stats64 *stats)
> +{
> +       struct team *team = netdev_priv(dev);
> +       struct team_pcpu_stats *p;
> +       u64 rx_packets, rx_bytes, rx_multicast, tx_packets, tx_bytes;
> +       u32 rx_dropped = 0, tx_dropped = 0;
> +       unsigned int start;
> +       int i;
> +
> +       for_each_possible_cpu(i) {
> +               p = per_cpu_ptr(team->pcpu_stats, i);
> +               do {
> +                       start = u64_stats_fetch_begin_bh(&p->syncp);
> +                       rx_packets      = p->rx_packets;
> +                       rx_bytes        = p->rx_bytes;
> +                       rx_multicast    = p->rx_multicast;
> +                       tx_packets      = p->tx_packets;
> +                       tx_bytes        = p->tx_bytes;
> +               } while (u64_stats_fetch_retry_bh(&p->syncp, start));
> +
> +               stats->rx_packets       += rx_packets;
> +               stats->rx_bytes         += rx_bytes;
> +               stats->multicast        += rx_multicast;
> +               stats->tx_packets       += tx_packets;
> +               stats->tx_bytes         += tx_bytes;
> +               /*
> +                * rx_dropped & tx_dropped are u32, updated
> +                * without syncp protection.
> +                */
> +               rx_dropped      += p->rx_dropped;
> +               tx_dropped      += p->tx_dropped;
> +       }
> +       stats->rx_dropped       = rx_dropped;
> +       stats->tx_dropped       = tx_dropped;
> +       return stats;
> +}
> +
> +static void team_vlan_rx_add_vid(struct net_device *dev, uint16_t vid)
> +{
> +       struct team *team = netdev_priv(dev);
> +       struct team_port *port;
> +
> +       rcu_read_lock();
> +       list_for_each_entry_rcu(port, &team->port_list, list) {
> +               const struct net_device_ops *ops = port->dev->netdev_ops;
> +
> +               ops->ndo_vlan_rx_add_vid(port->dev, vid);
> +       }
> +       rcu_read_unlock();
> +}
> +
> +static void team_vlan_rx_kill_vid(struct net_device *dev, uint16_t vid)
> +{
> +       struct team *team = netdev_priv(dev);
> +       struct team_port *port;
> +
> +       rcu_read_lock();
> +       list_for_each_entry_rcu(port, &team->port_list, list) {
> +               const struct net_device_ops *ops = port->dev->netdev_ops;
> +
> +               ops->ndo_vlan_rx_kill_vid(port->dev, vid);
> +       }
> +       rcu_read_unlock();
> +}
> +
> +static int team_add_slave(struct net_device *dev, struct net_device *port_dev)
> +{
> +       struct team *team = netdev_priv(dev);
> +       int err;
> +
> +       spin_lock(&team->lock);
> +       err = team_port_add(team, port_dev);
> +       spin_unlock(&team->lock);
> +       return err;
> +}
> +
> +static int team_del_slave(struct net_device *dev, struct net_device *port_dev)
> +{
> +       struct team *team = netdev_priv(dev);
> +       int err;
> +
> +       spin_lock(&team->lock);
> +       err = team_port_del(team, port_dev);
> +       spin_unlock(&team->lock);
> +       return err;
> +}
> +
> +static const struct net_device_ops team_netdev_ops = {
> +       .ndo_init               = team_init,
> +       .ndo_uninit             = team_uninit,
> +       .ndo_open               = team_open,
> +       .ndo_stop               = team_close,
> +       .ndo_start_xmit         = team_xmit,
> +       .ndo_change_rx_flags    = team_change_rx_flags,
> +       .ndo_set_rx_mode        = team_set_rx_mode,
> +       .ndo_set_mac_address    = team_set_mac_address,
> +       .ndo_change_mtu         = team_change_mtu,
> +       .ndo_get_stats64        = team_get_stats64,
> +       .ndo_vlan_rx_add_vid    = team_vlan_rx_add_vid,
> +       .ndo_vlan_rx_kill_vid   = team_vlan_rx_kill_vid,
> +       .ndo_add_slave          = team_add_slave,
> +       .ndo_del_slave          = team_del_slave,
> +};
> +
> +
> +/***********************
> + * rt netlink interface
> + ***********************/
> +
> +static void team_setup(struct net_device *dev)
> +{
> +       ether_setup(dev);
> +
> +       dev->netdev_ops = &team_netdev_ops;
> +       dev->destructor = team_destructor;
> +       dev->tx_queue_len = 0;
> +       dev->flags |= IFF_MULTICAST;
> +       dev->priv_flags &= ~(IFF_XMIT_DST_RELEASE | IFF_TX_SKB_SHARING);
> +
> +       /*
> +        * Indicate that we support unicast address filtering. That way the
> +        * core won't put us into promiscuous mode when a unicast address is
> +        * added. Leave this up to the underlying drivers.
> +        */
> +       dev->priv_flags |= IFF_UNICAST_FLT;
> +
> +       dev->features |= NETIF_F_LLTX;
> +       dev->features |= NETIF_F_GRO;
> +       dev->hw_features = NETIF_F_HW_VLAN_TX |
> +                          NETIF_F_HW_VLAN_RX |
> +                          NETIF_F_HW_VLAN_FILTER;
> +
> +       dev->features |= dev->hw_features;
> +}
> +
> +static int team_newlink(struct net *src_net, struct net_device *dev,
> +                       struct nlattr *tb[], struct nlattr *data[])
> +{
> +       int err;
> +
> +       if (tb[IFLA_ADDRESS] == NULL)
> +               random_ether_addr(dev->dev_addr);
> +
> +       err = register_netdevice(dev);
> +       if (err)
> +               return err;
> +
> +       return 0;
> +}
> +
> +static int team_validate(struct nlattr *tb[], struct nlattr *data[])
> +{
> +       if (tb[IFLA_ADDRESS]) {
> +               if (nla_len(tb[IFLA_ADDRESS]) != ETH_ALEN)
> +                       return -EINVAL;
> +               if (!is_valid_ether_addr(nla_data(tb[IFLA_ADDRESS])))
> +                       return -EADDRNOTAVAIL;
> +       }
> +       return 0;
> +}
> +
> +static struct rtnl_link_ops team_link_ops __read_mostly = {
> +       .kind           = DRV_NAME,
> +       .priv_size      = sizeof(struct team),
> +       .setup          = team_setup,
> +       .newlink        = team_newlink,
> +       .validate       = team_validate,
> +};
> +
> +
> +/***********************************
> + * Generic netlink custom interface
> + ***********************************/
> +
> +static struct genl_family team_nl_family = {
> +       .id             = GENL_ID_GENERATE,
> +       .name           = TEAM_GENL_NAME,
> +       .version        = TEAM_GENL_VERSION,
> +       .maxattr        = TEAM_ATTR_MAX,
> +       .netnsok        = true,
> +};
> +
> +static const struct nla_policy team_nl_policy[TEAM_ATTR_MAX + 1] = {
> +       [TEAM_ATTR_UNSPEC]                      = { .type = NLA_UNSPEC, },
> +       [TEAM_ATTR_TEAM_IFINDEX]                = { .type = NLA_U32 },
> +       [TEAM_ATTR_LIST_OPTION]                 = { .type = NLA_NESTED },
> +       [TEAM_ATTR_LIST_PORT]                   = { .type = NLA_NESTED },
> +};
> +
> +static const struct nla_policy
> +team_nl_option_policy[TEAM_ATTR_OPTION_MAX + 1] = {
> +       [TEAM_ATTR_OPTION_UNSPEC]               = { .type = NLA_UNSPEC, },
> +       [TEAM_ATTR_OPTION_NAME] = {
> +               .type = NLA_STRING,
> +               .len = TEAM_STRING_MAX_LEN,
> +       },
> +       [TEAM_ATTR_OPTION_CHANGED]              = { .type = NLA_FLAG },
> +       [TEAM_ATTR_OPTION_TYPE]                 = { .type = NLA_U8 },
> +       [TEAM_ATTR_OPTION_DATA] = {
> +               .type = NLA_BINARY,
> +               .len = TEAM_STRING_MAX_LEN,
> +       },
> +};
> +
> +static int team_nl_cmd_noop(struct sk_buff *skb, struct genl_info *info)
> +{
> +       struct sk_buff *msg;
> +       void *hdr;
> +       int err;
> +
> +       msg = nlmsg_new(NLMSG_GOODSIZE, GFP_KERNEL);
> +       if (!msg)
> +               return -ENOMEM;
> +
> +       hdr = genlmsg_put(msg, info->snd_pid, info->snd_seq,
> +                         &team_nl_family, 0, TEAM_CMD_NOOP);
> +       if (IS_ERR(hdr)) {
> +               err = PTR_ERR(hdr);
> +               goto err_msg_put;
> +       }
> +
> +       genlmsg_end(msg, hdr);
> +
> +       return genlmsg_unicast(genl_info_net(info), msg, info->snd_pid);
> +
> +err_msg_put:
> +       nlmsg_free(msg);
> +
> +       return err;
> +}
> +
> +/*
> + * Netlink cmd functions must be bracketed by the following two functions.
> + * To ensure team_uninit is not called in between, rcu_read_lock is held
> + * the whole time.
> + */
> +static struct team *team_nl_team_get(struct genl_info *info)
> +{
> +       struct net *net = genl_info_net(info);
> +       int ifindex;
> +       struct net_device *dev;
> +       struct team *team;
> +
> +       if (!info->attrs[TEAM_ATTR_TEAM_IFINDEX])
> +               return NULL;
> +
> +       ifindex = nla_get_u32(info->attrs[TEAM_ATTR_TEAM_IFINDEX]);
> +       rcu_read_lock();
> +       dev = dev_get_by_index_rcu(net, ifindex);
> +       if (!dev || dev->netdev_ops != &team_netdev_ops) {
> +               rcu_read_unlock();
> +               return NULL;
> +       }
> +
> +       team = netdev_priv(dev);
> +       spin_lock(&team->lock);
> +       return team;
> +}
> +
> +static void team_nl_team_put(struct team *team)
> +{
> +       spin_unlock(&team->lock);
> +       rcu_read_unlock();
> +}
> +
> +static int team_nl_send_generic(struct genl_info *info, struct team *team,
> +                               int (*fill_func)(struct sk_buff *skb,
> +                                                struct genl_info *info,
> +                                                int flags, struct team *team))
> +{
> +       struct sk_buff *skb;
> +       int err;
> +
> +       skb = nlmsg_new(NLMSG_GOODSIZE, GFP_KERNEL);
> +       if (!skb)
> +               return -ENOMEM;
> +
> +       err = fill_func(skb, info, NLM_F_ACK, team);
> +       if (err < 0)
> +               goto err_fill;
> +
> +       err = genlmsg_unicast(genl_info_net(info), skb, info->snd_pid);
> +       return err;
> +
> +err_fill:
> +       nlmsg_free(skb);
> +       return err;
> +}
> +
> +static int team_nl_fill_options_get_changed(struct sk_buff *skb,
> +                                           u32 pid, u32 seq, int flags,
> +                                           struct team *team,
> +                                           struct team_option *changed_option)
> +{
> +       struct nlattr *option_list;
> +       void *hdr;
> +       struct team_option *option;
> +
> +       hdr = genlmsg_put(skb, pid, seq, &team_nl_family, flags,
> +                         TEAM_CMD_OPTIONS_GET);
> +       if (IS_ERR(hdr))
> +               return PTR_ERR(hdr);
> +
> +       NLA_PUT_U32(skb, TEAM_ATTR_TEAM_IFINDEX, team->dev->ifindex);
> +       option_list = nla_nest_start(skb, TEAM_ATTR_LIST_OPTION);
> +       if (!option_list)
> +               return -EMSGSIZE;
> +
> +       list_for_each_entry(option, &team->option_list, list) {
> +               struct nlattr *option_item;
> +               long arg;
> +
> +               option_item = nla_nest_start(skb, TEAM_ATTR_ITEM_OPTION);
> +               if (!option_item)
> +                       goto nla_put_failure;
> +               NLA_PUT_STRING(skb, TEAM_ATTR_OPTION_NAME, option->name);
> +               if (option == changed_option)
> +                       NLA_PUT_FLAG(skb, TEAM_ATTR_OPTION_CHANGED);
> +               switch (option->type) {
> +               case TEAM_OPTION_TYPE_U32:
> +                       NLA_PUT_U8(skb, TEAM_ATTR_OPTION_TYPE, NLA_U32);
> +                       team_option_get(team, option, &arg);
> +                       NLA_PUT_U32(skb, TEAM_ATTR_OPTION_DATA, arg);
> +                       break;
> +               case TEAM_OPTION_TYPE_STRING:
> +                       NLA_PUT_U8(skb, TEAM_ATTR_OPTION_TYPE, NLA_STRING);
> +                       team_option_get(team, option, &arg);
> +                       NLA_PUT_STRING(skb, TEAM_ATTR_OPTION_DATA,
> +                                      (char *) arg);
> +                       break;
> +               default:
> +                       BUG();
> +               }
> +               nla_nest_end(skb, option_item);
> +       }
> +
> +       nla_nest_end(skb, option_list);
> +       return genlmsg_end(skb, hdr);
> +
> +nla_put_failure:
> +       genlmsg_cancel(skb, hdr);
> +       return -EMSGSIZE;
> +}
> +
> +static int team_nl_fill_options_get(struct sk_buff *skb,
> +                                   struct genl_info *info, int flags,
> +                                   struct team *team)
> +{
> +       return team_nl_fill_options_get_changed(skb, info->snd_pid,
> +                                               info->snd_seq, NLM_F_ACK,
> +                                               team, NULL);
> +}
> +
> +static int team_nl_cmd_options_get(struct sk_buff *skb, struct genl_info *info)
> +{
> +       struct team *team;
> +       int err;
> +
> +       team = team_nl_team_get(info);
> +       if (!team)
> +               return -EINVAL;
> +
> +       err = team_nl_send_generic(info, team, team_nl_fill_options_get);
> +
> +       team_nl_team_put(team);
> +
> +       return err;
> +}
> +
> +static int team_nl_cmd_options_set(struct sk_buff *skb, struct genl_info *info)
> +{
> +       struct team *team;
> +       int err = 0;
> +       int i;
> +       struct nlattr *nl_option;
> +
> +       team = team_nl_team_get(info);
> +       if (!team)
> +               return -EINVAL;
> +
> +       err = -EINVAL;
> +       if (!info->attrs[TEAM_ATTR_LIST_OPTION]) {
> +               err = -EINVAL;
> +               goto team_put;
> +       }
> +
> +       nla_for_each_nested(nl_option, info->attrs[TEAM_ATTR_LIST_OPTION], i) {
> +               struct nlattr *mode_attrs[TEAM_ATTR_OPTION_MAX + 1];
> +               enum team_option_type opt_type;
> +               struct team_option *option;
> +               char *opt_name;
> +               bool opt_found = false;
> +
> +               if (nla_type(nl_option) != TEAM_ATTR_ITEM_OPTION) {
> +                       err = -EINVAL;
> +                       goto team_put;
> +               }
> +               err = nla_parse_nested(mode_attrs, TEAM_ATTR_OPTION_MAX,
> +                                      nl_option, team_nl_option_policy);
> +               if (err)
> +                       goto team_put;
> +               if (!mode_attrs[TEAM_ATTR_OPTION_NAME] ||
> +                   !mode_attrs[TEAM_ATTR_OPTION_TYPE] ||
> +                   !mode_attrs[TEAM_ATTR_OPTION_DATA]) {
> +                       err = -EINVAL;
> +                       goto team_put;
> +               }
> +               switch (nla_get_u8(mode_attrs[TEAM_ATTR_OPTION_TYPE])) {
> +               case NLA_U32:
> +                       opt_type = TEAM_OPTION_TYPE_U32;
> +                       break;
> +               case NLA_STRING:
> +                       opt_type = TEAM_OPTION_TYPE_STRING;
> +                       break;
> +               default:
> +                       err = -EINVAL;
> +                       goto team_put;
> +               }
> +
> +               opt_name = nla_data(mode_attrs[TEAM_ATTR_OPTION_NAME]);
> +               list_for_each_entry(option, &team->option_list, list) {
> +                       long arg;
> +                       struct nlattr *opt_data_attr;
> +
> +                       if (option->type != opt_type ||
> +                           strcmp(option->name, opt_name))
> +                               continue;
> +                       opt_found = true;
> +                       opt_data_attr = mode_attrs[TEAM_ATTR_OPTION_DATA];
> +                       switch (opt_type) {
> +                       case TEAM_OPTION_TYPE_U32:
> +                               arg = nla_get_u32(opt_data_attr);
> +                               break;
> +                       case TEAM_OPTION_TYPE_STRING:
> +                               arg = (long) nla_data(opt_data_attr);
> +                               break;
> +                       default:
> +                               BUG();
> +                       }
> +                       err = team_option_set(team, option, &arg);
> +                       if (err)
> +                               goto team_put;
> +               }
> +               if (!opt_found) {
> +                       err = -ENOENT;
> +                       goto team_put;
> +               }
> +       }
> +
> +team_put:
> +       team_nl_team_put(team);
> +
> +       return err;
> +}
> +
> +static int team_nl_fill_port_list_get_changed(struct sk_buff *skb,
> +                                             u32 pid, u32 seq, int flags,
> +                                             struct team *team,
> +                                             struct team_port *changed_port)
> +{
> +       struct nlattr *port_list;
> +       void *hdr;
> +       struct team_port *port;
> +
> +       hdr = genlmsg_put(skb, pid, seq, &team_nl_family, flags,
> +                         TEAM_CMD_PORT_LIST_GET);
> +       if (IS_ERR(hdr))
> +               return PTR_ERR(hdr);
> +
> +       NLA_PUT_U32(skb, TEAM_ATTR_TEAM_IFINDEX, team->dev->ifindex);
> +       port_list = nla_nest_start(skb, TEAM_ATTR_LIST_PORT);
> +       if (!port_list)
> +               return -EMSGSIZE;
> +
> +       list_for_each_entry_rcu(port, &team->port_list, list) {
> +               struct nlattr *port_item;
> +
> +               port_item = nla_nest_start(skb, TEAM_ATTR_ITEM_PORT);
> +               if (!port_item)
> +                       goto nla_put_failure;
> +               NLA_PUT_U32(skb, TEAM_ATTR_PORT_IFINDEX, port->dev->ifindex);
> +               if (port == changed_port)
> +                       NLA_PUT_FLAG(skb, TEAM_ATTR_PORT_CHANGED);
> +               if (port->linkup)
> +                       NLA_PUT_FLAG(skb, TEAM_ATTR_PORT_LINKUP);
> +               NLA_PUT_U32(skb, TEAM_ATTR_PORT_SPEED, port->speed);
> +               NLA_PUT_U8(skb, TEAM_ATTR_PORT_DUPLEX, port->duplex);
> +               nla_nest_end(skb, port_item);
> +       }
> +
> +       nla_nest_end(skb, port_list);
> +       return genlmsg_end(skb, hdr);
> +
> +nla_put_failure:
> +       genlmsg_cancel(skb, hdr);
> +       return -EMSGSIZE;
> +}
> +
> +static int team_nl_fill_port_list_get(struct sk_buff *skb,
> +                                     struct genl_info *info, int flags,
> +                                     struct team *team)
> +{
> +       return team_nl_fill_port_list_get_changed(skb, info->snd_pid,
> +                                                 info->snd_seq, NLM_F_ACK,
> +                                                 team, NULL);
> +}
> +
> +static int team_nl_cmd_port_list_get(struct sk_buff *skb,
> +                                    struct genl_info *info)
> +{
> +       struct team *team;
> +       int err;
> +
> +       team = team_nl_team_get(info);
> +       if (!team)
> +               return -EINVAL;
> +
> +       err = team_nl_send_generic(info, team, team_nl_fill_port_list_get);
> +
> +       team_nl_team_put(team);
> +
> +       return err;
> +}
> +
> +static struct genl_ops team_nl_ops[] = {
> +       {
> +               .cmd = TEAM_CMD_NOOP,
> +               .doit = team_nl_cmd_noop,
> +               .policy = team_nl_policy,
> +       },
> +       {
> +               .cmd = TEAM_CMD_OPTIONS_SET,
> +               .doit = team_nl_cmd_options_set,
> +               .policy = team_nl_policy,
> +               .flags = GENL_ADMIN_PERM,
> +       },
> +       {
> +               .cmd = TEAM_CMD_OPTIONS_GET,
> +               .doit = team_nl_cmd_options_get,
> +               .policy = team_nl_policy,
> +               .flags = GENL_ADMIN_PERM,
> +       },
> +       {
> +               .cmd = TEAM_CMD_PORT_LIST_GET,
> +               .doit = team_nl_cmd_port_list_get,
> +               .policy = team_nl_policy,
> +               .flags = GENL_ADMIN_PERM,
> +       },
> +};
> +
> +static struct genl_multicast_group team_change_event_mcgrp = {
> +       .name = TEAM_GENL_CHANGE_EVENT_MC_GRP_NAME,
> +};
> +
> +static int team_nl_send_event_options_get(struct team *team,
> +                                         struct team_option *changed_option)
> +{
> +       struct sk_buff *skb;
> +       int err;
> +       struct net *net = dev_net(team->dev);
> +
> +       skb = nlmsg_new(NLMSG_GOODSIZE, GFP_KERNEL);
> +       if (!skb)
> +               return -ENOMEM;
> +
> +       err = team_nl_fill_options_get_changed(skb, 0, 0, 0, team,
> +                                              changed_option);
> +       if (err < 0)
> +               goto err_fill;
> +
> +       err = genlmsg_multicast_netns(net, skb, 0, team_change_event_mcgrp.id,
> +                                     GFP_KERNEL);
> +       return err;
> +
> +err_fill:
> +       nlmsg_free(skb);
> +       return err;
> +}
> +
> +static int team_nl_send_event_port_list_get(struct team_port *port)
> +{
> +       struct sk_buff *skb;
> +       int err;
> +       struct net *net = dev_net(port->team->dev);
> +
> +       skb = nlmsg_new(NLMSG_GOODSIZE, GFP_KERNEL);
> +       if (!skb)
> +               return -ENOMEM;
> +
> +       err = team_nl_fill_port_list_get_changed(skb, 0, 0, 0,
> +                                                port->team, port);
> +       if (err < 0)
> +               goto err_fill;
> +
> +       err = genlmsg_multicast_netns(net, skb, 0, team_change_event_mcgrp.id,
> +                                     GFP_KERNEL);
> +       return err;
> +
> +err_fill:
> +       nlmsg_free(skb);
> +       return err;
> +}
> +
> +static int team_nl_init(void)
> +{
> +       int err;
> +
> +       err = genl_register_family_with_ops(&team_nl_family, team_nl_ops,
> +                                           ARRAY_SIZE(team_nl_ops));
> +       if (err)
> +               return err;
> +
> +       err = genl_register_mc_group(&team_nl_family, &team_change_event_mcgrp);
> +       if (err)
> +               goto err_change_event_grp_reg;
> +
> +       return 0;
> +
> +err_change_event_grp_reg:
> +       genl_unregister_family(&team_nl_family);
> +
> +       return err;
> +}
> +
> +static void team_nl_fini(void)
> +{
> +       genl_unregister_family(&team_nl_family);
> +}
> +
> +
> +/******************
> + * Change checkers
> + ******************/
> +
> +static void __team_options_change_check(struct team *team,
> +                                       struct team_option *changed_option)
> +{
> +       int err;
> +
> +       err = team_nl_send_event_options_get(team, changed_option);
> +       if (err)
> +               netdev_warn(team->dev, "Failed to send options change via netlink\n");
> +}
> +
> +/* rtnl lock is held */
> +static void __team_port_change_check(struct team_port *port, bool linkup)
> +{
> +       int err;
> +
> +       if (port->linkup == linkup)
> +               return;
> +
> +       port->linkup = linkup;
> +       if (linkup) {
> +               struct ethtool_cmd ecmd;
> +
> +               err = __ethtool_get_settings(port->dev, &ecmd);
> +               if (!err) {
> +                       port->speed = ethtool_cmd_speed(&ecmd);
> +                       port->duplex = ecmd.duplex;
> +                       goto send_event;
> +               }
> +       }
> +       port->speed = 0;
> +       port->duplex = 0;
> +
> +send_event:
> +       err = team_nl_send_event_port_list_get(port);
> +       if (err)
> +               netdev_warn(port->team->dev, "Failed to send port change of device %s via netlink\n",
> +                           port->dev->name);
> +
> +}
> +
> +static void team_port_change_check(struct team_port *port, bool linkup)
> +{
> +       struct team *team = port->team;
> +
> +       spin_lock(&team->lock);
> +       __team_port_change_check(port, linkup);
> +       spin_unlock(&team->lock);
> +}
> +
> +/************************************
> + * Net device notifier event handler
> + ************************************/
> +
> +static int team_device_event(struct notifier_block *unused,
> +                            unsigned long event, void *ptr)
> +{
> +       struct net_device *dev = (struct net_device *) ptr;
> +       struct team_port *port;
> +
> +       port = team_port_get_rtnl(dev);
> +       if (!port)
> +               return NOTIFY_DONE;
> +
> +       switch (event) {
> +       case NETDEV_UP:
> +               if (netif_carrier_ok(dev))
> +                       team_port_change_check(port, true);
> +               break;
> +       case NETDEV_DOWN:
> +               team_port_change_check(port, false);
> +               break;
> +       case NETDEV_CHANGE:
> +               if (netif_running(port->dev))
> +                       team_port_change_check(port,
> +                                              !!netif_carrier_ok(port->dev));
> +               break;
> +       case NETDEV_UNREGISTER:
> +               team_del_slave(port->team->dev, dev);
> +               break;
> +       case NETDEV_FEAT_CHANGE:
> +               team_compute_features(port->team);
> +               break;
> +       case NETDEV_CHANGEMTU:
> +               /* Forbid changing the MTU of the underlying device */
> +               return NOTIFY_BAD;
> +       case NETDEV_CHANGEADDR:
> +               /* Forbid changing the address of the underlying device */
> +               return NOTIFY_BAD;
> +       case NETDEV_PRE_TYPE_CHANGE:
> +               /* Forbid changing the type of the underlying device */
> +               return NOTIFY_BAD;
> +       }
> +       return NOTIFY_DONE;
> +}
> +
> +static struct notifier_block team_notifier_block __read_mostly = {
> +       .notifier_call = team_device_event,
> +};
> +
> +
> +/***********************
> + * Module init and exit
> + ***********************/
> +
> +static int __init team_module_init(void)
> +{
> +       int err;
> +
> +       register_netdevice_notifier(&team_notifier_block);
> +
> +       err = rtnl_link_register(&team_link_ops);
> +       if (err)
> +               goto err_rtln_reg;
> +
> +       err = team_nl_init();
> +       if (err)
> +               goto err_nl_init;
> +
> +       return 0;
> +
> +err_nl_init:
> +       rtnl_link_unregister(&team_link_ops);
> +
> +err_rtln_reg:
> +       unregister_netdevice_notifier(&team_notifier_block);
> +
> +       return err;
> +}
> +
> +static void __exit team_module_exit(void)
> +{
> +       team_nl_fini();
> +       rtnl_link_unregister(&team_link_ops);
> +       unregister_netdevice_notifier(&team_notifier_block);
> +}
> +
> +module_init(team_module_init);
> +module_exit(team_module_exit);
> +
> +MODULE_LICENSE("GPL v2");
> +MODULE_AUTHOR("Jiri Pirko <jpirko@redhat.com>");
> +MODULE_DESCRIPTION("Ethernet team device driver");
> +MODULE_ALIAS_RTNL_LINK(DRV_NAME);
> diff --git a/drivers/net/team/team_mode_activebackup.c b/drivers/net/team/team_mode_activebackup.c
> new file mode 100644
> index 0000000..1aa2bfb
> --- /dev/null
> +++ b/drivers/net/team/team_mode_activebackup.c
> @@ -0,0 +1,152 @@
> +/*
> + * drivers/net/team/team_mode_activebackup.c - Active-backup mode for team
> + * Copyright (c) 2011 Jiri Pirko <jpirko@redhat.com>
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + */
> +
> +#include <linux/kernel.h>
> +#include <linux/types.h>
> +#include <linux/module.h>
> +#include <linux/init.h>
> +#include <linux/errno.h>
> +#include <linux/netdevice.h>
> +#include <net/rtnetlink.h>
> +#include <linux/if_team.h>
> +
> +struct ab_priv {
> +       struct team_port __rcu *active_port;
> +};
> +
> +static struct ab_priv *ab_priv(struct team *team)
> +{
> +       return (struct ab_priv *) &team->mode_priv;
> +}
> +
> +static rx_handler_result_t ab_receive(struct team *team, struct team_port *port,
> +                                     struct sk_buff *skb) {
> +       struct team_port *active_port;
> +
> +       active_port = rcu_dereference(ab_priv(team)->active_port);
> +       if (active_port != port)
> +               return RX_HANDLER_EXACT;
> +       return RX_HANDLER_ANOTHER;
> +}
> +
> +static bool ab_transmit(struct team *team, struct sk_buff *skb)
> +{
> +       struct team_port *active_port;
> +
> +       active_port = rcu_dereference(ab_priv(team)->active_port);
> +       if (unlikely(!active_port))
> +               goto drop;
> +       skb->dev = active_port->dev;
> +       if (dev_queue_xmit(skb))
> +               return false;
> +       return true;
> +
> +drop:
> +       dev_kfree_skb(skb);
> +       return false;
> +}
> +
> +static void ab_port_leave(struct team *team, struct team_port *port)
> +{
> +       if (ab_priv(team)->active_port == port)
> +               rcu_assign_pointer(ab_priv(team)->active_port, NULL);
> +}
> +
> +static void ab_port_change_mac(struct team *team, struct team_port *port)
> +{
> +       if (ab_priv(team)->active_port == port)
> +               team_port_set_team_mac(port);
> +}
> +
> +static int ab_active_port_get(struct team *team, void *arg)
> +{
> +       u32 *ifindex = arg;
> +
> +       *ifindex = 0;
> +       if (ab_priv(team)->active_port)
> +               *ifindex = ab_priv(team)->active_port->dev->ifindex;
> +       return 0;
> +}
> +
> +static int ab_active_port_set(struct team *team, void *arg)
> +{
> +       u32 *ifindex = arg;
> +       struct team_port *port;
> +
> +       list_for_each_entry_rcu(port, &team->port_list, list) {
> +               if (port->dev->ifindex == *ifindex) {
> +                       struct team_port *ac_port = ab_priv(team)->active_port;
> +
> +                       /* rtnl_lock needs to be held when setting macs */
> +                       rtnl_lock();
> +                       if (ac_port)
> +                               team_port_set_orig_mac(ac_port);
> +                       rcu_assign_pointer(ab_priv(team)->active_port, port);
> +                       team_port_set_team_mac(port);
> +                       rtnl_unlock();
> +                       return 0;
> +               }
> +       }
> +       return -ENOENT;
> +}
> +
> +static struct team_option ab_options[] = {
> +       {
> +               .name = "activeport",
> +               .type = TEAM_OPTION_TYPE_U32,
> +               .getter = ab_active_port_get,
> +               .setter = ab_active_port_set,
> +       },
> +};
> +
> +static int ab_init(struct team *team)
> +{
> +       team_options_register(team, ab_options, ARRAY_SIZE(ab_options));
> +       return 0;
> +}
> +
> +static void ab_exit(struct team *team)
> +{
> +       team_options_unregister(team, ab_options, ARRAY_SIZE(ab_options));
> +}
> +
> +static const struct team_mode_ops ab_mode_ops = {
> +       .init                   = ab_init,
> +       .exit                   = ab_exit,
> +       .receive                = ab_receive,
> +       .transmit               = ab_transmit,
> +       .port_leave             = ab_port_leave,
> +       .port_change_mac        = ab_port_change_mac,
> +};
> +
> +static struct team_mode ab_mode = {
> +       .kind           = "activebackup",
> +       .owner          = THIS_MODULE,
> +       .priv_size      = sizeof(struct ab_priv),
> +       .ops            = &ab_mode_ops,
> +};
> +
> +static int __init ab_init_module(void)
> +{
> +       return team_mode_register(&ab_mode);
> +}
> +
> +static void __exit ab_cleanup_module(void)
> +{
> +       team_mode_unregister(&ab_mode);
> +}
> +
> +module_init(ab_init_module);
> +module_exit(ab_cleanup_module);
> +
> +MODULE_LICENSE("GPL v2");
> +MODULE_AUTHOR("Jiri Pirko <jpirko@redhat.com>");
> +MODULE_DESCRIPTION("Active-backup mode for team");
> +MODULE_ALIAS("team-mode-activebackup");
> diff --git a/drivers/net/team/team_mode_roundrobin.c b/drivers/net/team/team_mode_roundrobin.c
> new file mode 100644
> index 0000000..0374052
> --- /dev/null
> +++ b/drivers/net/team/team_mode_roundrobin.c
> @@ -0,0 +1,107 @@
> +/*
> + * drivers/net/team/team_mode_roundrobin.c - Round-robin mode for team
> + * Copyright (c) 2011 Jiri Pirko <jpirko@redhat.com>
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + */
> +
> +#include <linux/kernel.h>
> +#include <linux/types.h>
> +#include <linux/module.h>
> +#include <linux/init.h>
> +#include <linux/errno.h>
> +#include <linux/netdevice.h>
> +#include <linux/if_team.h>
> +
> +struct rr_priv {
> +       unsigned int sent_packets;
> +};
> +
> +static struct rr_priv *rr_priv(struct team *team)
> +{
> +       return (struct rr_priv *) &team->mode_priv;
> +}
> +
> +static struct team_port *__get_first_port_up(struct team *team,
> +                                            struct team_port *port)
> +{
> +       struct team_port *cur;
> +
> +       if (port->linkup)
> +               return port;
> +       cur = port;
> +       list_for_each_entry_continue_rcu(cur, &team->port_list, list)
> +               if (cur->linkup)
> +                       return cur;
> +       list_for_each_entry_rcu(cur, &team->port_list, list) {
> +               if (cur == port)
> +                       break;
> +               if (cur->linkup)
> +                       return cur;
> +       }
> +       return NULL;
> +}
> +
> +static bool rr_transmit(struct team *team, struct sk_buff *skb)
> +{
> +       struct team_port *port;
> +       int port_index;
> +
> +       port_index = rr_priv(team)->sent_packets++ % team->port_count;
> +       port = team_get_port_by_index_rcu(team, port_index);
> +       port = __get_first_port_up(team, port);
> +       if (unlikely(!port))
> +               goto drop;
> +       skb->dev = port->dev;
> +       if (dev_queue_xmit(skb))
> +               return false;
> +       return true;
> +
> +drop:
> +       dev_kfree_skb(skb);
> +       return false;
> +}
> +
> +static int rr_port_enter(struct team *team, struct team_port *port)
> +{
> +       return team_port_set_team_mac(port);
> +}
> +
> +static void rr_port_change_mac(struct team *team, struct team_port *port)
> +{
> +       team_port_set_team_mac(port);
> +}
> +
> +static const struct team_mode_ops rr_mode_ops = {
> +       .transmit               = rr_transmit,
> +       .port_enter             = rr_port_enter,
> +       .port_change_mac        = rr_port_change_mac,
> +};
> +
> +static struct team_mode rr_mode = {
> +       .kind           = "roundrobin",
> +       .owner          = THIS_MODULE,
> +       .priv_size      = sizeof(struct rr_priv),
> +       .ops            = &rr_mode_ops,
> +};
> +
> +static int __init rr_init_module(void)
> +{
> +       return team_mode_register(&rr_mode);
> +}
> +
> +static void __exit rr_cleanup_module(void)
> +{
> +       team_mode_unregister(&rr_mode);
> +}
> +
> +module_init(rr_init_module);
> +module_exit(rr_cleanup_module);
> +
> +MODULE_LICENSE("GPL v2");
> +MODULE_AUTHOR("Jiri Pirko <jpirko@redhat.com>");
> +MODULE_DESCRIPTION("Round-robin mode for team");
> +MODULE_ALIAS("team-mode-roundrobin");
> diff --git a/include/linux/Kbuild b/include/linux/Kbuild
> index 619b565..0b091b3 100644
> --- a/include/linux/Kbuild
> +++ b/include/linux/Kbuild
> @@ -185,6 +185,7 @@ header-y += if_pppol2tp.h
>  header-y += if_pppox.h
>  header-y += if_slip.h
>  header-y += if_strip.h
> +header-y += if_team.h
>  header-y += if_tr.h
>  header-y += if_tun.h
>  header-y += if_tunnel.h
> diff --git a/include/linux/if.h b/include/linux/if.h
> index db20bd4..06b6ef6 100644
> --- a/include/linux/if.h
> +++ b/include/linux/if.h
> @@ -79,6 +79,7 @@
>  #define IFF_TX_SKB_SHARING     0x10000 /* The interface supports sharing
>                                         * skbs on transmit */
>  #define IFF_UNICAST_FLT        0x20000         /* Supports unicast filtering   */
> +#define IFF_TEAM_PORT  0x40000         /* device used as team port */
>
>  #define IF_GET_IFACE   0x0001          /* for querying only */
>  #define IF_GET_PROTO   0x0002
> diff --git a/include/linux/if_team.h b/include/linux/if_team.h
> new file mode 100644
> index 0000000..21581a7
> --- /dev/null
> +++ b/include/linux/if_team.h
> @@ -0,0 +1,233 @@
> +/*
> + * include/linux/if_team.h - Network team device driver header
> + * Copyright (c) 2011 Jiri Pirko <jpirko@redhat.com>
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + */
> +
> +#ifndef _LINUX_IF_TEAM_H_
> +#define _LINUX_IF_TEAM_H_
> +
> +#ifdef __KERNEL__
> +
> +struct team_pcpu_stats {
> +       u64                     rx_packets;
> +       u64                     rx_bytes;
> +       u64                     rx_multicast;
> +       u64                     tx_packets;
> +       u64                     tx_bytes;
> +       struct u64_stats_sync   syncp;
> +       u32                     rx_dropped;
> +       u32                     tx_dropped;
> +};
> +
> +struct team;
> +
> +struct team_port {
> +       struct net_device *dev;
> +       struct hlist_node hlist; /* node in hash list */
> +       struct list_head list; /* node in ordinary list */
> +       struct team *team;
> +       int index;
> +
> +       /*
> +        * A place for storing original values of the device before it
> +        * becomes a port.
> +        */
> +       struct {
> +               unsigned char dev_addr[MAX_ADDR_LEN];
> +               unsigned int mtu;
> +       } orig;
> +
> +       bool linkup;
> +       u32 speed;
> +       u8 duplex;
> +
> +       struct rcu_head rcu;
> +};
> +
> +struct team_mode_ops {
> +       int (*init)(struct team *team);
> +       void (*exit)(struct team *team);
> +       rx_handler_result_t (*receive)(struct team *team,
> +                                      struct team_port *port,
> +                                      struct sk_buff *skb);
> +       bool (*transmit)(struct team *team, struct sk_buff *skb);
> +       int (*port_enter)(struct team *team, struct team_port *port);
> +       void (*port_leave)(struct team *team, struct team_port *port);
> +       void (*port_change_mac)(struct team *team, struct team_port *port);
> +};
> +
> +enum team_option_type {
> +       TEAM_OPTION_TYPE_U32,
> +       TEAM_OPTION_TYPE_STRING,
> +};
> +
> +struct team_option {
> +       struct list_head list;
> +       const char *name;
> +       enum team_option_type type;
> +       int (*getter)(struct team *team, void *arg);
> +       int (*setter)(struct team *team, void *arg);
> +};
> +
> +struct team_mode {
> +       struct list_head list;
> +       const char *kind;
> +       struct module *owner;
> +       size_t priv_size;
> +       const struct team_mode_ops *ops;
> +};
> +
> +#define TEAM_MODE_PRIV_LONGS 4
> +#define TEAM_MODE_PRIV_SIZE (sizeof(long) * TEAM_MODE_PRIV_LONGS)
> +
> +struct team {
> +       struct net_device *dev; /* associated netdevice */
> +       struct team_pcpu_stats __percpu *pcpu_stats;
> +
> +       spinlock_t lock; /* used for overall locking, e.g. port lists write */
> +
> +       /*
> +        * port lists with port count
> +        */
> +       int port_count;
> +       struct hlist_head *port_hlist;
> +       struct list_head port_list;
> +
> +       struct list_head option_list;
> +
> +       const char *mode_kind;
> +       struct team_mode_ops mode_ops;
> +       long mode_priv[TEAM_MODE_PRIV_LONGS];
> +};
> +
> +#define TEAM_PORT_HASHBITS 4
> +#define TEAM_PORT_HASHENTRIES (1 << TEAM_PORT_HASHBITS)
> +
> +static inline struct hlist_head *
> +team_port_index_hash(const struct team *team,
> +                    int port_index)
> +{
> +       return &team->port_hlist[port_index & (TEAM_PORT_HASHENTRIES - 1)];
> +}
> +
> +static inline struct team_port *
> +team_get_port_by_index_rcu(const struct team *team,
> +                          int port_index)
> +{
> +       struct hlist_node *p;
> +       struct team_port *port;
> +       struct hlist_head *head = team_port_index_hash(team, port_index);
> +
> +       hlist_for_each_entry_rcu(port, p, head, hlist)
> +               if (port->index == port_index)
> +                       return port;
> +       return NULL;
> +}
> +
> +extern int team_port_set_orig_mac(struct team_port *port);
> +extern int team_port_set_team_mac(struct team_port *port);
> +extern void team_options_register(struct team *team,
> +                                 struct team_option *option,
> +                                 size_t option_count);
> +extern void team_options_unregister(struct team *team,
> +                                   struct team_option *option,
> +                                   size_t option_count);
> +extern int team_mode_register(struct team_mode *mode);
> +extern int team_mode_unregister(struct team_mode *mode);
> +
> +#endif /* __KERNEL__ */
> +
> +#define TEAM_STRING_MAX_LEN 32
> +
> +/**********************************
> + * NETLINK_GENERIC netlink family.
> + **********************************/
> +
> +enum {
> +       TEAM_CMD_NOOP,
> +       TEAM_CMD_OPTIONS_SET,
> +       TEAM_CMD_OPTIONS_GET,
> +       TEAM_CMD_PORT_LIST_GET,
> +
> +       __TEAM_CMD_MAX,
> +       TEAM_CMD_MAX = (__TEAM_CMD_MAX - 1),
> +};
> +
> +enum {
> +       TEAM_ATTR_UNSPEC,
> +       TEAM_ATTR_TEAM_IFINDEX,         /* u32 */
> +       TEAM_ATTR_LIST_OPTION,          /* nest */
> +       TEAM_ATTR_LIST_PORT,            /* nest */
> +
> +       __TEAM_ATTR_MAX,
> +       TEAM_ATTR_MAX = __TEAM_ATTR_MAX - 1,
> +};
> +
> +/* Nested layout of get/set msg:
> + *
> + *     [TEAM_ATTR_LIST_OPTION]
> + *             [TEAM_ATTR_ITEM_OPTION]
> + *                     [TEAM_ATTR_OPTION_*], ...
> + *             [TEAM_ATTR_ITEM_OPTION]
> + *                     [TEAM_ATTR_OPTION_*], ...
> + *             ...
> + *     [TEAM_ATTR_LIST_PORT]
> + *             [TEAM_ATTR_ITEM_PORT]
> + *                     [TEAM_ATTR_PORT_*], ...
> + *             [TEAM_ATTR_ITEM_PORT]
> + *                     [TEAM_ATTR_PORT_*], ...
> + *             ...
> + */
> +
> +enum {
> +       TEAM_ATTR_ITEM_OPTION_UNSPEC,
> +       TEAM_ATTR_ITEM_OPTION,          /* nest */
> +
> +       __TEAM_ATTR_ITEM_OPTION_MAX,
> +       TEAM_ATTR_ITEM_OPTION_MAX = __TEAM_ATTR_ITEM_OPTION_MAX - 1,
> +};
> +
> +enum {
> +       TEAM_ATTR_OPTION_UNSPEC,
> +       TEAM_ATTR_OPTION_NAME,          /* string */
> +       TEAM_ATTR_OPTION_CHANGED,       /* flag */
> +       TEAM_ATTR_OPTION_TYPE,          /* u8 */
> +       TEAM_ATTR_OPTION_DATA,          /* dynamic */
> +
> +       __TEAM_ATTR_OPTION_MAX,
> +       TEAM_ATTR_OPTION_MAX = __TEAM_ATTR_OPTION_MAX - 1,
> +};
> +
> +enum {
> +       TEAM_ATTR_ITEM_PORT_UNSPEC,
> +       TEAM_ATTR_ITEM_PORT,            /* nest */
> +
> +       __TEAM_ATTR_ITEM_PORT_MAX,
> +       TEAM_ATTR_ITEM_PORT_MAX = __TEAM_ATTR_ITEM_PORT_MAX - 1,
> +};
> +
> +enum {
> +       TEAM_ATTR_PORT_UNSPEC,
> +       TEAM_ATTR_PORT_IFINDEX,         /* u32 */
> +       TEAM_ATTR_PORT_CHANGED,         /* flag */
> +       TEAM_ATTR_PORT_LINKUP,          /* flag */
> +       TEAM_ATTR_PORT_SPEED,           /* u32 */
> +       TEAM_ATTR_PORT_DUPLEX,          /* u8 */
> +
> +       __TEAM_ATTR_PORT_MAX,
> +       TEAM_ATTR_PORT_MAX = __TEAM_ATTR_PORT_MAX - 1,
> +};
> +
> +/*
> + * NETLINK_GENERIC related info
> + */
> +#define TEAM_GENL_NAME "team"
> +#define TEAM_GENL_VERSION 0x1
> +#define TEAM_GENL_CHANGE_EVENT_MC_GRP_NAME "change_event"
> +
> +#endif /* _LINUX_IF_TEAM_H_ */
> --
> 1.7.6
>

^ permalink raw reply

* [non-quoted-printable PATCH] Fix caif BUG() with network namespaces
From: Woodhouse, David @ 2011-10-23 21:24 UTC (permalink / raw)
  To: Sjur Brændeland, davem@redhat.com; +Cc: netdev@vger.kernel.org
In-Reply-To: <CAJK669a0cX73J0pEr3EtCkbxsaHRfxnS-56Q3Q5ANdKd47mnrg@mail.gmail.com>

The caif code will register its own pernet_operations, and then register
a netdevice_notifier. Each time the netdevice_notifier is triggered,
it'll do some stuff... including a lookup of its own pernet stuff with
net_generic().

If the net_generic() call ever returns NULL, the caif code will BUG().
That doesn't seem *so* unreasonable, I suppose — it does seem like it
should never happen.

However, it *does* happen. When we clone a network namespace,
setup_net() runs through all the pernet_operations one at a time. It
gets to loopback before it gets to caif. And loopback_net_init()
registers a netdevice... while caif hasn't been initialised. So the caif
netdevice notifier triggers, and immediately goes BUG().

I'm not entirely sure how best to fix this in the general case. Perhaps
the netdevice_notifier registration should be pernet too, rather than
global? Or perhaps we should suppress the notifier calls during
setup_net() and flush them at the end after everything has been
initialised?

But really, I'm inclined to just take the simple approach. Make
caif_device_notify() *not* go looking for its pernet data structures if
the device it's being notified about isn't a caif device in the first
place. This simple patch is sufficient to avoid the problem, and is
probably good enough.

Cc: stable@kernel.org
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>

---
Sorry, forgot to disable signature on previous patch so it got sent as
QP. This should be better.
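
For readers following the ordering argument, here is a minimal user-space
sketch (not kernel code) of why moving the per-namespace lookup below the
device-type check avoids the BUG(). All names in it (fake_net, fake_dev,
lookup_pernet, notify_old, notify_fixed, FAKE_TYPE_CAIF) are hypothetical
stand-ins; the real code uses net_generic(), dev_net() and ARPHRD_CAIF.

```c
#include <stddef.h>

/* Hypothetical stand-ins for struct net and struct net_device. */
enum fake_type { FAKE_TYPE_LOOPBACK, FAKE_TYPE_CAIF };

struct fake_net {
	void *caif_data;	/* NULL until the caif pernet init has run */
};

struct fake_dev {
	enum fake_type type;
	struct fake_net *net;
};

/* Models the per-namespace lookup; the real one would BUG() on NULL. */
static void *lookup_pernet(const struct fake_net *net, int *bug_hit)
{
	if (net->caif_data == NULL)
		*bug_hit = 1;
	return net->caif_data;
}

/* Old ordering: per-namespace lookup happens before the type check. */
static int notify_old(const struct fake_dev *dev)
{
	int bug_hit = 0;

	(void)lookup_pernet(dev->net, &bug_hit);
	if (dev->type != FAKE_TYPE_CAIF)
		return bug_hit;	/* too late, the lookup already ran */
	return bug_hit;
}

/* New ordering: non-caif devices return before any lookup is made. */
static int notify_fixed(const struct fake_dev *dev)
{
	int bug_hit = 0;

	if (dev->type != FAKE_TYPE_CAIF)
		return 0;
	(void)lookup_pernet(dev->net, &bug_hit);
	return bug_hit;
}
```

With a loopback device in a namespace whose caif data is still NULL, the
old ordering reports the failed lookup and the new ordering does not.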

diff --git a/net/caif/caif_dev.c b/net/caif/caif_dev.c
index 7f9ac07..47fc8f3 100644
--- a/net/caif/caif_dev.c
+++ b/net/caif/caif_dev.c
@@ -212,8 +212,7 @@ static int caif_device_notify(struct notifier_block *me, unsigned long what,
 	enum cfcnfg_phy_preference pref;
 	enum cfcnfg_phy_type phy_type;
 	struct cfcnfg *cfg;
-	struct caif_device_entry_list *caifdevs =
-	    caif_device_list(dev_net(dev));
+	struct caif_device_entry_list *caifdevs;
 
 	if (dev->type != ARPHRD_CAIF)
 		return 0;
@@ -222,6 +221,8 @@ static int caif_device_notify(struct notifier_block *me, unsigned long what,
 	if (cfg == NULL)
 		return 0;
 
+	caifdevs = caif_device_list(dev_net(dev));
+
 	switch (what) {
 	case NETDEV_REGISTER:
 		caifd = caif_device_alloc(dev);


                   Sent with MeeGo's ActiveSync support.

David Woodhouse                            Open Source Technology Centre
David.Woodhouse@intel.com                              Intel Corporation




^ permalink raw reply related

* [PATCH] Fix caif BUG() with network namespaces
From: Woodhouse, David @ 2011-10-23 21:21 UTC (permalink / raw)
  To: Sjur Brændeland, davem@redhat.com; +Cc: netdev@vger.kernel.org
In-Reply-To: <CAJK669a0cX73J0pEr3EtCkbxsaHRfxnS-56Q3Q5ANdKd47mnrg@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 2404 bytes --]

The caif code will register its own pernet_operations, and then register
a netdevice_notifier. Each time the netdevice_notifier is triggered,
it'll do some stuff... including a lookup of its own pernet stuff with
net_generic().

If the net_generic() call ever returns NULL, the caif code will BUG().
That doesn't seem *so* unreasonable, I suppose — it does seem like it
should never happen.

However, it *does* happen. When we clone a network namespace,
setup_net() runs through all the pernet_operations one at a time. It
gets to loopback before it gets to caif. And loopback_net_init()
registers a netdevice... while caif hasn't been initialised. So the caif
netdevice notifier triggers, and immediately goes BUG().

I'm not entirely sure how best to fix this in the general case. Perhaps
the netdevice_notifier registration should be pernet too, rather than
global? Or perhaps we should suppress the notifier calls during
setup_net() and flush them at the end after everything has been
initialised?

But really, I'm inclined to just take the simple approach. Make
caif_device_notify() *not* go looking for its pernet data structures if
the device it's being notified about isn't a caif device in the first
place. This simple patch is sufficient to avoid the problem, and is
probably good enough.

Cc: stable@kernel.org
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>

diff --git a/net/caif/caif_dev.c b/net/caif/caif_dev.c
index 7f9ac07..47fc8f3 100644
--- a/net/caif/caif_dev.c
+++ b/net/caif/caif_dev.c
@@ -212,8 +212,7 @@ static int caif_device_notify(struct notifier_block *me, unsigned long what,
 	enum cfcnfg_phy_preference pref;
 	enum cfcnfg_phy_type phy_type;
 	struct cfcnfg *cfg;
-	struct caif_device_entry_list *caifdevs =
-	    caif_device_list(dev_net(dev));
+	struct caif_device_entry_list *caifdevs;
 
 	if (dev->type != ARPHRD_CAIF)
 		return 0;
@@ -222,6 +221,8 @@ static int caif_device_notify(struct notifier_block *me, unsigned long what,
 	if (cfg == NULL)
 		return 0;
 
+	caifdevs = caif_device_list(dev_net(dev));
+
 	switch (what) {
 	case NETDEV_REGISTER:
 		caifd = caif_device_alloc(dev);


                   Sent with MeeGo's ActiveSync support.

David Woodhouse                            Open Source Technology Centre
David.Woodhouse@intel.com                              Intel Corporation



[-- Attachment #2: smime.p7s --]
[-- Type: application/x-pkcs7-signature, Size: 4370 bytes --]

^ permalink raw reply related

* -next: NET_VENDOR_8390 dependencies
From: Geert Uytterhoeven @ 2011-10-23 21:21 UTC (permalink / raw)
  To: Jeff Kirsher; +Cc: netdev, linux-kernel

drivers/net/ethernet/8390/Kconfig:

config NET_VENDOR_8390
        bool "National Semi-conductor 8390 devices"
        default y
        depends on NET_VENDOR_NATSEMI && (AMIGA_PCMCIA || PCI || SUPERH || \
                   ISA || MCA || EISA || MAC || M32R || MACH_TX49XX || \
                   MCA_LEGACY || H8300 || ARM || MIPS || ZORRO || PCMCIA || \
                   EXPERIMENTAL)
        ---help---
          If you have a network (Ethernet) card belonging to this class, say Y
          and read the Ethernet-HOWTO, available from
          <http://www.tldp.org/docs.html#howto>.

          Note that the answer to this question doesn't directly affect the
          kernel: saying N will just cause the configurator to skip all
          the questions about Western Digital cards. If you say Y, you will be
          asked for your specific card in the following questions.

So NET_VENDOR_8390 depends on NET_VENDOR_NATSEMI.

drivers/net/ethernet/natsemi/Kconfig:

config NET_VENDOR_NATSEMI
        bool "National Semi-conductor devices"
        default y
        depends on MCA || MAC || MACH_JAZZ || PCI || XTENSA_PLATFORM_XT2000
        ---help---
          If you have a network (Ethernet) card belonging to this class, say Y
          and read the Ethernet-HOWTO, available from
          <http://www.tldp.org/docs.html#howto>.

          Note that the answer to this question doesn't directly affect the
          kernel: saying N will just cause the configurator to skip all
          the questions about National Semi-conductor devices. If you say Y,
          you will be asked for your specific card in the following questions.

But NET_VENDOR_NATSEMI will never be true for several of the other
dependencies of NET_VENDOR_8390 (e.g. AMIGA_PCMCIA, EISA, H8300, ARM,
ZORRO, PCMCIA)?
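
To make the concern concrete, the two "depends on" lines can be read as
plain boolean expressions. The sketch below is only a model in C, not
Kconfig itself: struct cfg and the two helper functions are hypothetical,
the symbol lists are copied from the two entries quoted above, and it
assumes the user answers Y wherever the dependencies allow it.

```c
#include <stdbool.h>

/* One flag per Kconfig symbol named in the two "depends on" lines. */
struct cfg {
	bool amiga_pcmcia, pci, superh, isa, mca, eisa, mac, m32r;
	bool mach_tx49xx, mca_legacy, h8300, arm, mips, zorro, pcmcia;
	bool experimental, mach_jazz, xtensa_platform_xt2000;
};

/* depends on MCA || MAC || MACH_JAZZ || PCI || XTENSA_PLATFORM_XT2000 */
static bool net_vendor_natsemi(const struct cfg *c)
{
	return c->mca || c->mac || c->mach_jazz || c->pci ||
	       c->xtensa_platform_xt2000;
}

/* depends on NET_VENDOR_NATSEMI && (AMIGA_PCMCIA || PCI || ...) */
static bool net_vendor_8390(const struct cfg *c)
{
	bool bus = c->amiga_pcmcia || c->pci || c->superh || c->isa ||
		   c->mca || c->eisa || c->mac || c->m32r ||
		   c->mach_tx49xx || c->mca_legacy || c->h8300 || c->arm ||
		   c->mips || c->zorro || c->pcmcia || c->experimental;

	return net_vendor_natsemi(c) && bus;
}
```

In this model, a configuration with only ZORRO and AMIGA_PCMCIA set
satisfies the second half of the NET_VENDOR_8390 expression but not
NET_VENDOR_NATSEMI, so the 8390 menu is unreachable there, while a
PCI-only configuration passes both.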

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply

* [PATCH 1/5] m68k/net: Remove obsolete IRQ_FLG_* users
From: Geert Uytterhoeven @ 2011-10-23 21:18 UTC (permalink / raw)
  To: linux-m68k, linux-kernel; +Cc: Geert Uytterhoeven, David S. Miller, netdev

The m68k core irq code stopped honoring these flags during the irq
restructuring in 2006.

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: netdev@vger.kernel.org
---
This is a split-up of "m68k/irq: Remove obsolete IRQ_FLG_* definitions and
users", to reduce conflicts and dependencies.

diff --git a/drivers/net/ethernet/natsemi/macsonic.c b/drivers/net/ethernet/natsemi/macsonic.c
index c93679e..aae4cfc 100644
--- a/drivers/net/ethernet/natsemi/macsonic.c
+++ b/drivers/net/ethernet/natsemi/macsonic.c
@@ -142,8 +142,7 @@ static int macsonic_open(struct net_device* dev)
 {
 	int retval;
 
-	retval = request_irq(dev->irq, sonic_interrupt, IRQ_FLG_FAST,
-				"sonic", dev);
+	retval = request_irq(dev->irq, sonic_interrupt, 0, "sonic", dev);
 	if (retval) {
 		printk(KERN_ERR "%s: unable to get IRQ %d.\n",
 				dev->name, dev->irq);
@@ -154,8 +153,8 @@ static int macsonic_open(struct net_device* dev)
 	 * rupt as well, which must prevent re-entrance of the sonic handler.
 	 */
 	if (dev->irq == IRQ_AUTO_3) {
-		retval = request_irq(IRQ_NUBUS_9, macsonic_interrupt,
-					IRQ_FLG_FAST, "sonic", dev);
+		retval = request_irq(IRQ_NUBUS_9, macsonic_interrupt, 0,
+				     "sonic", dev);
 		if (retval) {
 			printk(KERN_ERR "%s: unable to get IRQ %d.\n",
 					dev->name, IRQ_NUBUS_9);

^ permalink raw reply related

* Re: [patch net-next V3] net: introduce ethernet teaming device
From: Jiri Pirko @ 2011-10-23 17:21 UTC (permalink / raw)
  To: netdev; +Cc: davem
In-Reply-To: <1319359253-1328-1-git-send-email-jpirko@redhat.com>

Please scratch this. V4 will follow up.

Sun, Oct 23, 2011 at 10:40:53AM CEST, jpirko@redhat.com wrote:
>This patch introduces a new network device called team. It is supposed to
>be a very fast, simple, userspace-driven alternative to the existing
>bonding driver.
>
>A userspace library called libteam, with a couple of demo apps, is
>available here:
>https://github.com/jpirko/libteam
>Note it's still in its diapers atm.
>
>team<->libteam use generic netlink for communication. That and rtnl
>are supposed to be the only ways to configure a team device, no sysfs etc.
>
>A Python binding basis for libteam was recently introduced (some work
>still needs to be done on it though). A daemon providing arpmon/miimon
>active-backup functionality will be introduced shortly.
>Everything necessary is already implemented in the kernel team driver.
>
>Signed-off-by: Jiri Pirko <jpirko@redhat.com>
>
>v2->v3:
>	- team_change_mtu() uses rcu version of list traversal to unwind
>	- set and clear of mode_ops happens per pointer, not per byte
>	- port hashlist changed to be embedded into team structure
>	- error branch in team_port_enter() does cleanup now
>	- fixed rtln->rtnl
>
>v1->v2:
>	- modes are made as modules. Makes team more modular and
>	  extendable.
>	- several commenters' nitpicks found on v1 were fixed
>	- several other bugs were fixed.
>	- note I ignored Eric's comment about roundrobin port selector
>	  as Eric's way may be easily implemented as another mode (mode
>	  "random") in future.
>---
> Documentation/networking/team.txt         |    2 +
> MAINTAINERS                               |    7 +
> drivers/net/Kconfig                       |    2 +
> drivers/net/Makefile                      |    1 +
> drivers/net/team/Kconfig                  |   38 +
> drivers/net/team/Makefile                 |    7 +
> drivers/net/team/team.c                   | 1574 +++++++++++++++++++++++++++++
> drivers/net/team/team_mode_activebackup.c |  152 +++
> drivers/net/team/team_mode_roundrobin.c   |  107 ++
> include/linux/Kbuild                      |    1 +
> include/linux/if.h                        |    1 +
> include/linux/if_team.h                   |  254 +++++
> include/linux/rculist.h                   |   14 +
> 13 files changed, 2160 insertions(+), 0 deletions(-)
> create mode 100644 Documentation/networking/team.txt
> create mode 100644 drivers/net/team/Kconfig
> create mode 100644 drivers/net/team/Makefile
> create mode 100644 drivers/net/team/team.c
> create mode 100644 drivers/net/team/team_mode_activebackup.c
> create mode 100644 drivers/net/team/team_mode_roundrobin.c
> create mode 100644 include/linux/if_team.h
>
>diff --git a/Documentation/networking/team.txt b/Documentation/networking/team.txt
>new file mode 100644
>index 0000000..5a01368
>--- /dev/null
>+++ b/Documentation/networking/team.txt
>@@ -0,0 +1,2 @@
>+Team devices are driven from userspace via libteam library which is here:
>+	https://github.com/jpirko/libteam
>diff --git a/MAINTAINERS b/MAINTAINERS
>index 5008b08..c33400d 100644
>--- a/MAINTAINERS
>+++ b/MAINTAINERS
>@@ -6372,6 +6372,13 @@ W:	http://tcp-lp-mod.sourceforge.net/
> S:	Maintained
> F:	net/ipv4/tcp_lp.c
> 
>+TEAM DRIVER
>+M:	Jiri Pirko <jpirko@redhat.com>
>+L:	netdev@vger.kernel.org
>+S:	Supported
>+F:	drivers/net/team/
>+F:	include/linux/if_team.h
>+
> TEGRA SUPPORT
> M:	Colin Cross <ccross@android.com>
> M:	Erik Gilling <konkers@android.com>
>diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig
>index 583f66c..b3020be 100644
>--- a/drivers/net/Kconfig
>+++ b/drivers/net/Kconfig
>@@ -125,6 +125,8 @@ config IFB
> 	  'ifb1' etc.
> 	  Look at the iproute2 documentation directory for usage etc
> 
>+source "drivers/net/team/Kconfig"
>+
> config MACVLAN
> 	tristate "MAC-VLAN support (EXPERIMENTAL)"
> 	depends on EXPERIMENTAL
>diff --git a/drivers/net/Makefile b/drivers/net/Makefile
>index fa877cd..4e4ebfe 100644
>--- a/drivers/net/Makefile
>+++ b/drivers/net/Makefile
>@@ -17,6 +17,7 @@ obj-$(CONFIG_NET) += Space.o loopback.o
> obj-$(CONFIG_NETCONSOLE) += netconsole.o
> obj-$(CONFIG_PHYLIB) += phy/
> obj-$(CONFIG_RIONET) += rionet.o
>+obj-$(CONFIG_NET_TEAM) += team/
> obj-$(CONFIG_TUN) += tun.o
> obj-$(CONFIG_VETH) += veth.o
> obj-$(CONFIG_VIRTIO_NET) += virtio_net.o
>diff --git a/drivers/net/team/Kconfig b/drivers/net/team/Kconfig
>new file mode 100644
>index 0000000..70a43a6
>--- /dev/null
>+++ b/drivers/net/team/Kconfig
>@@ -0,0 +1,38 @@
>+menuconfig NET_TEAM
>+	tristate "Ethernet team driver support (EXPERIMENTAL)"
>+	depends on EXPERIMENTAL
>+	---help---
>+	  This allows one to create virtual interfaces that team together
>+	  multiple ethernet devices.
>+
>+	  Team devices can be added using the "ip" command from the
>+	  iproute2 package:
>+
>+	  "ip link add link [ address MAC ] [ NAME ] type team"
>+
>+	  To compile this driver as a module, choose M here: the module
>+	  will be called team.
>+
>+if NET_TEAM
>+
>+config NET_TEAM_MODE_ROUNDROBIN
>+	tristate "Round-robin mode support"
>+	depends on NET_TEAM
>+	---help---
>+	  Basic mode where port used for transmitting packets is selected in
>+	  round-robin fashion using packet counter.
>+
>+	  To compile this team mode as a module, choose M here: the module
>+	  will be called team_mode_roundrobin.
>+
>+config NET_TEAM_MODE_ACTIVEBACKUP
>+	tristate "Active-backup mode support"
>+	depends on NET_TEAM
>+	---help---
>+	  Only one port is active at a time and the rest of ports are used
>+	  for backup.
>+
>+	  To compile this team mode as a module, choose M here: the module
>+	  will be called team_mode_activebackup.
>+
>+endif # NET_TEAM
>diff --git a/drivers/net/team/Makefile b/drivers/net/team/Makefile
>new file mode 100644
>index 0000000..85f2028
>--- /dev/null
>+++ b/drivers/net/team/Makefile
>@@ -0,0 +1,7 @@
>+#
>+# Makefile for the network team driver
>+#
>+
>+obj-$(CONFIG_NET_TEAM) += team.o
>+obj-$(CONFIG_NET_TEAM_MODE_ROUNDROBIN) += team_mode_roundrobin.o
>+obj-$(CONFIG_NET_TEAM_MODE_ACTIVEBACKUP) += team_mode_activebackup.o
>diff --git a/drivers/net/team/team.c b/drivers/net/team/team.c
>new file mode 100644
>index 0000000..8004916
>--- /dev/null
>+++ b/drivers/net/team/team.c
>@@ -0,0 +1,1574 @@
>+/*
>+ * drivers/net/team/team.c - Network team device driver
>+ * Copyright (c) 2011 Jiri Pirko <jpirko@redhat.com>
>+ *
>+ * This program is free software; you can redistribute it and/or modify
>+ * it under the terms of the GNU General Public License as published by
>+ * the Free Software Foundation; either version 2 of the License, or
>+ * (at your option) any later version.
>+ */
>+
>+#include <linux/kernel.h>
>+#include <linux/types.h>
>+#include <linux/module.h>
>+#include <linux/init.h>
>+#include <linux/slab.h>
>+#include <linux/rcupdate.h>
>+#include <linux/errno.h>
>+#include <linux/ctype.h>
>+#include <linux/notifier.h>
>+#include <linux/netdevice.h>
>+#include <linux/if_arp.h>
>+#include <linux/socket.h>
>+#include <linux/etherdevice.h>
>+#include <linux/rtnetlink.h>
>+#include <net/rtnetlink.h>
>+#include <net/genetlink.h>
>+#include <net/netlink.h>
>+#include <linux/if_team.h>
>+
>+#define DRV_NAME "team"
>+
>+
>+/**********
>+ * Helpers
>+ **********/
>+
>+#define team_port_exists(dev) (dev->priv_flags & IFF_TEAM_PORT)
>+
>+static struct team_port *team_port_get_rcu(const struct net_device *dev)
>+{
>+	struct team_port *port = rcu_dereference(dev->rx_handler_data);
>+
>+	return team_port_exists(dev) ? port : NULL;
>+}
>+
>+static struct team_port *team_port_get_rtnl(const struct net_device *dev)
>+{
>+	struct team_port *port = rtnl_dereference(dev->rx_handler_data);
>+
>+	return team_port_exists(dev) ? port : NULL;
>+}
>+
>+/*
>+ * Since the ability to change the mac address of an open port device is
>+ * tested in team_port_add, callers may ignore this function's return value
>+ */
>+static int __set_port_mac(struct net_device *port_dev,
>+			  const unsigned char *dev_addr)
>+{
>+	struct sockaddr addr;
>+
>+	memcpy(addr.sa_data, dev_addr, ETH_ALEN);
>+	addr.sa_family = ARPHRD_ETHER;
>+	return dev_set_mac_address(port_dev, &addr);
>+}
>+
>+int team_port_set_orig_mac(struct team_port *port)
>+{
>+	return __set_port_mac(port->dev, port->orig.dev_addr);
>+}
>+EXPORT_SYMBOL(team_port_set_orig_mac);
>+
>+int team_port_set_team_mac(struct team_port *port)
>+{
>+	return __set_port_mac(port->dev, port->team->dev->dev_addr);
>+}
>+EXPORT_SYMBOL(team_port_set_team_mac);
>+
>+
>+/*******************
>+ * Options handling
>+ *******************/
>+
>+void team_options_register(struct team *team, struct team_option *option,
>+			   size_t option_count)
>+{
>+	int i;
>+
>+	for (i = 0; i < option_count; i++, option++)
>+		list_add_tail(&option->list, &team->option_list);
>+}
>+EXPORT_SYMBOL(team_options_register);
>+
>+static void __team_options_change_check(struct team *team,
>+					struct team_option *changed_option);
>+
>+static void __team_options_unregister(struct team *team,
>+				      struct team_option *option,
>+				      size_t option_count)
>+{
>+	int i;
>+
>+	for (i = 0; i < option_count; i++, option++)
>+		list_del(&option->list);
>+}
>+
>+void team_options_unregister(struct team *team, struct team_option *option,
>+			     size_t option_count)
>+{
>+	__team_options_unregister(team, option, option_count);
>+	__team_options_change_check(team, NULL);
>+}
>+EXPORT_SYMBOL(team_options_unregister);
>+
>+static int team_option_get(struct team *team, struct team_option *option,
>+			   void *arg)
>+{
>+	return option->getter(team, arg);
>+}
>+
>+static int team_option_set(struct team *team, struct team_option *option,
>+			   void *arg)
>+{
>+	int err;
>+
>+	err = option->setter(team, arg);
>+	if (err)
>+		return err;
>+
>+	__team_options_change_check(team, option);
>+	return err;
>+}
>+
>+/****************
>+ * Mode handling
>+ ****************/
>+
>+static LIST_HEAD(mode_list);
>+static DEFINE_SPINLOCK(mode_list_lock);
>+
>+static struct team_mode *__find_mode(const char *kind)
>+{
>+	struct team_mode *mode;
>+
>+	list_for_each_entry(mode, &mode_list, list) {
>+		if (strcmp(mode->kind, kind) == 0)
>+			return mode;
>+	}
>+	return NULL;
>+}
>+
>+static bool is_good_mode_name(const char *name)
>+{
>+	while (*name != '\0') {
>+		if (!isalpha(*name) && !isdigit(*name) && *name != '_')
>+			return false;
>+		name++;
>+	}
>+	return true;
>+}
>+
>+int team_mode_register(struct team_mode *mode)
>+{
>+	int err = 0;
>+
>+	if (!is_good_mode_name(mode->kind) ||
>+	    mode->priv_size > TEAM_MODE_PRIV_SIZE)
>+		return -EINVAL;
>+	spin_lock(&mode_list_lock);
>+	if (__find_mode(mode->kind)) {
>+		err = -EEXIST;
>+		goto unlock;
>+	}
>+	list_add_tail(&mode->list, &mode_list);
>+unlock:
>+	spin_unlock(&mode_list_lock);
>+	return err;
>+}
>+EXPORT_SYMBOL(team_mode_register);
>+
>+int team_mode_unregister(struct team_mode *mode)
>+{
>+	spin_lock(&mode_list_lock);
>+	list_del_init(&mode->list);
>+	spin_unlock(&mode_list_lock);
>+	return 0;
>+}
>+EXPORT_SYMBOL(team_mode_unregister);
>+
>+static struct team_mode *team_mode_get(const char *kind)
>+{
>+	struct team_mode *mode;
>+
>+	spin_lock(&mode_list_lock);
>+	mode = __find_mode(kind);
>+	if (!mode) {
>+		spin_unlock(&mode_list_lock);
>+		request_module("team-mode-%s", kind);
>+		spin_lock(&mode_list_lock);
>+		mode = __find_mode(kind);
>+	}
>+	if (mode)
>+		if (!try_module_get(mode->owner))
>+			mode = NULL;
>+
>+	spin_unlock(&mode_list_lock);
>+	return mode;
>+}
>+
>+static void team_mode_put(const char *kind)
>+{
>+	struct team_mode *mode;
>+
>+	spin_lock(&mode_list_lock);
>+	mode = __find_mode(kind);
>+	BUG_ON(!mode);
>+	module_put(mode->owner);
>+	spin_unlock(&mode_list_lock);
>+}
>+
>+/*
>+ * We can benefit from the fact that it's ensured no port is present
>+ * at the time of mode change.
>+ */
>+static int __team_change_mode(struct team *team,
>+			      const struct team_mode *new_mode)
>+{
>+	/* Check if mode was previously set and do cleanup if so */
>+	if (team->mode_kind) {
>+		void (*exit_op)(struct team *team) = team->mode_ops.exit;
>+
>+		/* Clear ops area so no callback is called any longer */
>+		team_mode_ops_clear(&team->mode_ops);
>+
>+		synchronize_rcu();
>+
>+		if (exit_op)
>+			exit_op(team);
>+		team_mode_put(team->mode_kind);
>+		team->mode_kind = NULL;
>+		/* zero private data area */
>+		memset(&team->mode_priv, 0,
>+		       sizeof(struct team) - offsetof(struct team, mode_priv));
>+	}
>+
>+	if (!new_mode)
>+		return 0;
>+
>+	if (new_mode->ops->init) {
>+		int err;
>+
>+		err = new_mode->ops->init(team);
>+		if (err)
>+			return err;
>+	}
>+
>+	team->mode_kind = new_mode->kind;
>+	team_mode_ops_copy(&team->mode_ops, new_mode->ops);
>+
>+	return 0;
>+}
>+
>+static int team_change_mode(struct team *team, const char *kind)
>+{
>+	struct team_mode *new_mode;
>+	struct net_device *dev = team->dev;
>+	int err;
>+
>+	if (!list_empty(&team->port_list)) {
>+		netdev_err(dev, "No ports can be present during mode change\n");
>+		return -EBUSY;
>+	}
>+
>+	if (team->mode_kind && strcmp(team->mode_kind, kind) == 0) {
>+		netdev_err(dev, "Unable to change to the same mode the team is in\n");
>+		return -EINVAL;
>+	}
>+
>+	new_mode = team_mode_get(kind);
>+	if (!new_mode) {
>+		netdev_err(dev, "Mode \"%s\" not found\n", kind);
>+		return -EINVAL;
>+	}
>+
>+	err = __team_change_mode(team, new_mode);
>+	if (err) {
>+		netdev_err(dev, "Failed to change to mode \"%s\"\n", kind);
>+		team_mode_put(kind);
>+		return err;
>+	}
>+
>+	netdev_info(dev, "Mode changed to \"%s\"\n", kind);
>+	return 0;
>+}
>+
>+
>+/************************
>+ * Rx path frame handler
>+ ************************/
>+
>+/* note: already called with rcu_read_lock */
>+static rx_handler_result_t team_handle_frame(struct sk_buff **pskb)
>+{
>+	struct sk_buff *skb = *pskb;
>+	struct team_port *port;
>+	struct team *team;
>+	rx_handler_result_t res = RX_HANDLER_ANOTHER;
>+
>+	skb = skb_share_check(skb, GFP_ATOMIC);
>+	if (!skb)
>+		return RX_HANDLER_CONSUMED;
>+
>+	*pskb = skb;
>+
>+	port = team_port_get_rcu(skb->dev);
>+	team = port->team;
>+
>+	if (team->mode_ops.receive)
>+		res = team->mode_ops.receive(team, port, skb);
>+
>+	if (res == RX_HANDLER_ANOTHER) {
>+		struct team_pcpu_stats *pcpu_stats;
>+
>+		pcpu_stats = this_cpu_ptr(team->pcpu_stats);
>+		u64_stats_update_begin(&pcpu_stats->syncp);
>+		pcpu_stats->rx_packets++;
>+		pcpu_stats->rx_bytes += skb->len;
>+		if (skb->pkt_type == PACKET_MULTICAST)
>+			pcpu_stats->rx_multicast++;
>+		u64_stats_update_end(&pcpu_stats->syncp);
>+
>+		skb->dev = team->dev;
>+	} else {
>+		this_cpu_inc(team->pcpu_stats->rx_dropped);
>+	}
>+
>+	return res;
>+}
>+
>+
>+/****************
>+ * Port handling
>+ ****************/
>+
>+static bool team_port_find(const struct team *team,
>+			   const struct team_port *port)
>+{
>+	struct team_port *cur;
>+
>+	list_for_each_entry(cur, &team->port_list, list)
>+		if (cur == port)
>+			return true;
>+	return false;
>+}
>+
>+/*
>+ * Add/delete port to the team port list. Write guarded by rtnl_lock.
>+ * Takes care of correct port->index setup (might be racy).
>+ */
>+static void team_port_list_add_port(struct team *team,
>+				    struct team_port *port)
>+{
>+	port->index = team->port_count++;
>+	hlist_add_head_rcu(&port->hlist,
>+			   team_port_index_hash(team, port->index));
>+	list_add_tail_rcu(&port->list, &team->port_list);
>+}
>+
>+static void __reconstruct_port_hlist(struct team *team, int rm_index)
>+{
>+	int i;
>+	struct team_port *port;
>+
>+	for (i = rm_index + 1; i < team->port_count; i++) {
>+		port = team_get_port_by_index_rcu(team, i);
>+		hlist_del_rcu(&port->hlist);
>+		port->index--;
>+		hlist_add_head_rcu(&port->hlist,
>+				   team_port_index_hash(team, port->index));
>+	}
>+}
>+
>+static void team_port_list_del_port(struct team *team,
>+				   struct team_port *port)
>+{
>+	int rm_index = port->index;
>+
>+	hlist_del_rcu(&port->hlist);
>+	list_del_rcu(&port->list);
>+	__reconstruct_port_hlist(team, rm_index);
>+	team->port_count--;
>+}
>+
>+#define TEAM_VLAN_FEATURES (NETIF_F_ALL_CSUM | NETIF_F_SG | \
>+			    NETIF_F_FRAGLIST | NETIF_F_ALL_TSO | \
>+			    NETIF_F_HIGHDMA | NETIF_F_LRO)
>+
>+static void __team_compute_features(struct team *team)
>+{
>+	struct team_port *port;
>+	u32 vlan_features = TEAM_VLAN_FEATURES;
>+	unsigned short max_hard_header_len = ETH_HLEN;
>+
>+	list_for_each_entry(port, &team->port_list, list) {
>+		vlan_features = netdev_increment_features(vlan_features,
>+					port->dev->vlan_features,
>+					TEAM_VLAN_FEATURES);
>+
>+		if (port->dev->hard_header_len > max_hard_header_len)
>+			max_hard_header_len = port->dev->hard_header_len;
>+	}
>+
>+	team->dev->vlan_features = vlan_features;
>+	team->dev->hard_header_len = max_hard_header_len;
>+
>+	netdev_change_features(team->dev);
>+}
>+
>+static void team_compute_features(struct team *team)
>+{
>+	spin_lock(&team->lock);
>+	__team_compute_features(team);
>+	spin_unlock(&team->lock);
>+}
>+
>+static int team_port_enter(struct team *team, struct team_port *port)
>+{
>+	int err = 0;
>+
>+	dev_hold(team->dev);
>+	port->dev->priv_flags |= IFF_TEAM_PORT;
>+	if (team->mode_ops.port_enter) {
>+		err = team->mode_ops.port_enter(team, port);
>+		if (err) {
>+			netdev_err(team->dev, "Device %s failed to enter team mode\n",
>+				   port->dev->name);
>+			goto err_port_enter;
>+		}
>+	}
>+
>+	return 0;
>+
>+err_port_enter:
>+	port->dev->priv_flags &= ~IFF_TEAM_PORT;
>+	dev_put(team->dev);
>+
>+	return err;
>+}
>+
>+static void team_port_leave(struct team *team, struct team_port *port)
>+{
>+	if (team->mode_ops.port_leave)
>+		team->mode_ops.port_leave(team, port);
>+	port->dev->priv_flags &= ~IFF_TEAM_PORT;
>+	dev_put(team->dev);
>+}
>+
>+static void __team_port_change_check(struct team_port *port, bool linkup);
>+
>+static int team_port_add(struct team *team, struct net_device *port_dev)
>+{
>+	struct net_device *dev = team->dev;
>+	struct team_port *port;
>+	char *portname = port_dev->name;
>+	char tmp_addr[ETH_ALEN];
>+	int err;
>+
>+	if (port_dev->flags & IFF_LOOPBACK ||
>+	    port_dev->type != ARPHRD_ETHER) {
>+		netdev_err(dev, "Device %s is of an unsupported type\n",
>+			   portname);
>+		return -EINVAL;
>+	}
>+
>+	if (team_port_exists(port_dev)) {
>+		netdev_err(dev, "Device %s is already a port "
>+				"of a team device\n", portname);
>+		return -EBUSY;
>+	}
>+
>+	if (port_dev->flags & IFF_UP) {
>+		netdev_err(dev, "Device %s is up. Set it down before adding it as a team port\n",
>+			   portname);
>+		return -EBUSY;
>+	}
>+
>+	port = kzalloc(sizeof(struct team_port), GFP_KERNEL);
>+	if (!port)
>+		return -ENOMEM;
>+
>+	port->dev = port_dev;
>+	port->team = team;
>+
>+	port->orig.mtu = port_dev->mtu;
>+	err = dev_set_mtu(port_dev, dev->mtu);
>+	if (err) {
>+		netdev_dbg(dev, "Error %d calling dev_set_mtu\n", err);
>+		goto err_set_mtu;
>+	}
>+
>+	memcpy(port->orig.dev_addr, port_dev->dev_addr, ETH_ALEN);
>+	random_ether_addr(tmp_addr);
>+	err = __set_port_mac(port_dev, tmp_addr);
>+	if (err) {
>+		netdev_dbg(dev, "Device %s mac addr set failed\n",
>+			   portname);
>+		goto err_set_mac_rand;
>+	}
>+
>+	err = dev_open(port_dev);
>+	if (err) {
>+		netdev_dbg(dev, "Device %s opening failed\n",
>+			   portname);
>+		goto err_dev_open;
>+	}
>+
>+	err = team_port_set_orig_mac(port);
>+	if (err) {
>+		netdev_dbg(dev, "Device %s mac addr set failed - Device does not support addr change when it's opened\n",
>+			   portname);
>+		goto err_set_mac_opened;
>+	}
>+
>+	err = team_port_enter(team, port);
>+	if (err) {
>+		netdev_err(dev, "Device %s failed to enter team mode\n",
>+			   portname);
>+		goto err_port_enter;
>+	}
>+
>+	err = netdev_set_master(port_dev, dev);
>+	if (err) {
>+		netdev_err(dev, "Device %s failed to set master\n", portname);
>+		goto err_set_master;
>+	}
>+
>+	err = netdev_rx_handler_register(port_dev, team_handle_frame,
>+					 port);
>+	if (err) {
>+		netdev_err(dev, "Device %s failed to register rx_handler\n",
>+			   portname);
>+		goto err_handler_register;
>+	}
>+
>+	team_port_list_add_port(team, port);
>+	__team_compute_features(team);
>+	__team_port_change_check(port, !!netif_carrier_ok(port_dev));
>+
>+	netdev_info(dev, "Port device %s added\n", portname);
>+
>+	return 0;
>+
>+err_handler_register:
>+	netdev_set_master(port_dev, NULL);
>+
>+err_set_master:
>+	team_port_leave(team, port);
>+
>+err_port_enter:
>+err_set_mac_opened:
>+	dev_close(port_dev);
>+
>+err_dev_open:
>+	team_port_set_orig_mac(port);
>+
>+err_set_mac_rand:
>+	dev_set_mtu(port_dev, port->orig.mtu);
>+
>+err_set_mtu:
>+	kfree(port);
>+
>+	return err;
>+}
>+
>+static int team_port_del(struct team *team, struct net_device *port_dev)
>+{
>+	struct net_device *dev = team->dev;
>+	struct team_port *port;
>+	char *portname = port_dev->name;
>+
>+	port = team_port_get_rtnl(port_dev);
>+	if (!port || !team_port_find(team, port)) {
>+		netdev_err(dev, "Device %s does not act as a port of this team\n",
>+			   portname);
>+		return -ENOENT;
>+	}
>+
>+	__team_port_change_check(port, false);
>+	team_port_list_del_port(team, port);
>+	netdev_rx_handler_unregister(port_dev);
>+	netdev_set_master(port_dev, NULL);
>+	team_port_leave(team, port);
>+	dev_close(port_dev);
>+	team_port_set_orig_mac(port);
>+	dev_set_mtu(port_dev, port->orig.mtu);
>+	synchronize_rcu();
>+	kfree(port);
>+	netdev_info(dev, "Port device %s removed\n", portname);
>+	__team_compute_features(team);
>+
>+	return 0;
>+}
>+
>+
>+/*****************
>+ * Net device ops
>+ *****************/
>+
>+static const char team_no_mode_kind[] = "*NOMODE*";
>+
>+static int team_mode_option_get(struct team *team, void *arg)
>+{
>+	const char **str = arg;
>+
>+	*str = team->mode_kind ? team->mode_kind : team_no_mode_kind;
>+	return 0;
>+}
>+
>+static int team_mode_option_set(struct team *team, void *arg)
>+{
>+	const char **str = arg;
>+
>+	return team_change_mode(team, *str);
>+}
>+
>+static struct team_option team_options[] = {
>+	{
>+		.name = "mode",
>+		.type = TEAM_OPTION_TYPE_STRING,
>+		.getter = team_mode_option_get,
>+		.setter = team_mode_option_set,
>+	},
>+};
>+
>+static int team_init(struct net_device *dev)
>+{
>+	struct team *team = netdev_priv(dev);
>+	int i;
>+
>+	team->dev = dev;
>+	spin_lock_init(&team->lock);
>+
>+	team->pcpu_stats = alloc_percpu(struct team_pcpu_stats);
>+	if (!team->pcpu_stats)
>+		return -ENOMEM;
>+
>+	for (i = 0; i < TEAM_PORT_HASHENTRIES; i++)
>+		INIT_HLIST_HEAD(&team->port_hlist[i]);
>+	INIT_LIST_HEAD(&team->port_list);
>+
>+	INIT_LIST_HEAD(&team->option_list);
>+	team_options_register(team, team_options, ARRAY_SIZE(team_options));
>+	netif_carrier_off(dev);
>+
>+	return 0;
>+}
>+
>+static void team_uninit(struct net_device *dev)
>+{
>+	struct team *team = netdev_priv(dev);
>+	struct team_port *port;
>+	struct team_port *tmp;
>+
>+	spin_lock(&team->lock);
>+	list_for_each_entry_safe(port, tmp, &team->port_list, list)
>+		team_port_del(team, port->dev);
>+
>+	__team_change_mode(team, NULL); /* cleanup */
>+	__team_options_unregister(team, team_options, ARRAY_SIZE(team_options));
>+	spin_unlock(&team->lock);
>+}
>+
>+static void team_destructor(struct net_device *dev)
>+{
>+	struct team *team = netdev_priv(dev);
>+
>+	free_percpu(team->pcpu_stats);
>+	free_netdev(dev);
>+}
>+
>+static int team_open(struct net_device *dev)
>+{
>+	netif_carrier_on(dev);
>+	return 0;
>+}
>+
>+static int team_close(struct net_device *dev)
>+{
>+	netif_carrier_off(dev);
>+	return 0;
>+}
>+
>+/*
>+ * note: already called with rcu_read_lock
>+ */
>+static netdev_tx_t team_xmit(struct sk_buff *skb, struct net_device *dev)
>+{
>+	struct team *team = netdev_priv(dev);
>+	bool tx_success = false;
>+	unsigned int len = skb->len;
>+
>+	/*
>+	 * Ensure transmit function is called only in case there is at least
>+	 * one port present.
>+	 */
>+	if (likely(!list_empty(&team->port_list) && team->mode_ops.transmit))
>+		tx_success = team->mode_ops.transmit(team, skb);
>+	if (tx_success) {
>+		struct team_pcpu_stats *pcpu_stats;
>+
>+		pcpu_stats = this_cpu_ptr(team->pcpu_stats);
>+		u64_stats_update_begin(&pcpu_stats->syncp);
>+		pcpu_stats->tx_packets++;
>+		pcpu_stats->tx_bytes += len;
>+		u64_stats_update_end(&pcpu_stats->syncp);
>+	} else {
>+		this_cpu_inc(team->pcpu_stats->tx_dropped);
>+	}
>+
>+	return NETDEV_TX_OK;
>+}
>+
>+static void team_change_rx_flags(struct net_device *dev, int change)
>+{
>+	struct team *team = netdev_priv(dev);
>+	struct team_port *port;
>+	int inc;
>+
>+	rcu_read_lock();
>+	list_for_each_entry_rcu(port, &team->port_list, list) {
>+		if (change & IFF_PROMISC) {
>+			inc = dev->flags & IFF_PROMISC ? 1 : -1;
>+			dev_set_promiscuity(port->dev, inc);
>+		}
>+		if (change & IFF_ALLMULTI) {
>+			inc = dev->flags & IFF_ALLMULTI ? 1 : -1;
>+			dev_set_allmulti(port->dev, inc);
>+		}
>+	}
>+	rcu_read_unlock();
>+}
>+
>+static void team_set_rx_mode(struct net_device *dev)
>+{
>+	struct team *team = netdev_priv(dev);
>+	struct team_port *port;
>+
>+	rcu_read_lock();
>+	list_for_each_entry_rcu(port, &team->port_list, list) {
>+		dev_uc_sync(port->dev, dev);
>+		dev_mc_sync(port->dev, dev);
>+	}
>+	rcu_read_unlock();
>+}
>+
>+static int team_set_mac_address(struct net_device *dev, void *p)
>+{
>+	struct team *team = netdev_priv(dev);
>+	struct team_port *port;
>+	struct sockaddr *addr = p;
>+
>+	memcpy(dev->dev_addr, addr->sa_data, ETH_ALEN);
>+	rcu_read_lock();
>+	list_for_each_entry_rcu(port, &team->port_list, list)
>+		if (team->mode_ops.port_change_mac)
>+			team->mode_ops.port_change_mac(team, port);
>+	rcu_read_unlock();
>+	return 0;
>+}
>+
>+static int team_change_mtu(struct net_device *dev, int new_mtu)
>+{
>+	struct team *team = netdev_priv(dev);
>+	struct team_port *port;
>+	int err;
>+
>+	rcu_read_lock();
>+	list_for_each_entry_rcu(port, &team->port_list, list) {
>+		err = dev_set_mtu(port->dev, new_mtu);
>+		if (err) {
>+			netdev_err(dev, "Device %s failed to change mtu\n",
>+				   port->dev->name);
>+			goto unwind;
>+		}
>+	}
>+	rcu_read_unlock();
>+
>+	dev->mtu = new_mtu;
>+
>+	return 0;
>+
>+unwind:
>+	list_for_each_entry_continue_reverse_rcu(port, &team->port_list, list)
>+		dev_set_mtu(port->dev, dev->mtu);
>+
>+	rcu_read_unlock();
>+	return err;
>+}
>+
>+static struct rtnl_link_stats64 *
>+team_get_stats64(struct net_device *dev, struct rtnl_link_stats64 *stats)
>+{
>+	struct team *team = netdev_priv(dev);
>+	struct team_pcpu_stats *p;
>+	u64 rx_packets, rx_bytes, rx_multicast, tx_packets, tx_bytes;
>+	u32 rx_dropped = 0, tx_dropped = 0;
>+	unsigned int start;
>+	int i;
>+
>+	for_each_possible_cpu(i) {
>+		p = per_cpu_ptr(team->pcpu_stats, i);
>+		do {
>+			start = u64_stats_fetch_begin_bh(&p->syncp);
>+			rx_packets	= p->rx_packets;
>+			rx_bytes	= p->rx_bytes;
>+			rx_multicast	= p->rx_multicast;
>+			tx_packets	= p->tx_packets;
>+			tx_bytes	= p->tx_bytes;
>+		} while (u64_stats_fetch_retry_bh(&p->syncp, start));
>+
>+		stats->rx_packets	+= rx_packets;
>+		stats->rx_bytes		+= rx_bytes;
>+		stats->multicast	+= rx_multicast;
>+		stats->tx_packets	+= tx_packets;
>+		stats->tx_bytes		+= tx_bytes;
>+		/*
>+		 * rx_dropped & tx_dropped are u32, updated
>+		 * without syncp protection.
>+		 */
>+		rx_dropped	+= p->rx_dropped;
>+		tx_dropped	+= p->tx_dropped;
>+	}
>+	stats->rx_dropped	= rx_dropped;
>+	stats->tx_dropped	= tx_dropped;
>+	return stats;
>+}
>+
>+static void team_vlan_rx_add_vid(struct net_device *dev, uint16_t vid)
>+{
>+	struct team *team = netdev_priv(dev);
>+	struct team_port *port;
>+
>+	rcu_read_lock();
>+	list_for_each_entry_rcu(port, &team->port_list, list) {
>+		const struct net_device_ops *ops = port->dev->netdev_ops;
>+
>+		ops->ndo_vlan_rx_add_vid(port->dev, vid);
>+	}
>+	rcu_read_unlock();
>+}
>+
>+static void team_vlan_rx_kill_vid(struct net_device *dev, uint16_t vid)
>+{
>+	struct team *team = netdev_priv(dev);
>+	struct team_port *port;
>+
>+	rcu_read_lock();
>+	list_for_each_entry_rcu(port, &team->port_list, list) {
>+		const struct net_device_ops *ops = port->dev->netdev_ops;
>+
>+		ops->ndo_vlan_rx_kill_vid(port->dev, vid);
>+	}
>+	rcu_read_unlock();
>+}
>+
>+static int team_add_slave(struct net_device *dev, struct net_device *port_dev)
>+{
>+	struct team *team = netdev_priv(dev);
>+	int err;
>+
>+	spin_lock(&team->lock);
>+	err = team_port_add(team, port_dev);
>+	spin_unlock(&team->lock);
>+	return err;
>+}
>+
>+static int team_del_slave(struct net_device *dev, struct net_device *port_dev)
>+{
>+	struct team *team = netdev_priv(dev);
>+	int err;
>+
>+	spin_lock(&team->lock);
>+	err = team_port_del(team, port_dev);
>+	spin_unlock(&team->lock);
>+	return err;
>+}
>+
>+static const struct net_device_ops team_netdev_ops = {
>+	.ndo_init		= team_init,
>+	.ndo_uninit		= team_uninit,
>+	.ndo_open		= team_open,
>+	.ndo_stop		= team_close,
>+	.ndo_start_xmit		= team_xmit,
>+	.ndo_change_rx_flags	= team_change_rx_flags,
>+	.ndo_set_rx_mode	= team_set_rx_mode,
>+	.ndo_set_mac_address	= team_set_mac_address,
>+	.ndo_change_mtu		= team_change_mtu,
>+	.ndo_get_stats64	= team_get_stats64,
>+	.ndo_vlan_rx_add_vid	= team_vlan_rx_add_vid,
>+	.ndo_vlan_rx_kill_vid	= team_vlan_rx_kill_vid,
>+	.ndo_add_slave		= team_add_slave,
>+	.ndo_del_slave		= team_del_slave,
>+};
>+
>+
>+/***********************
>+ * rt netlink interface
>+ ***********************/
>+
>+static void team_setup(struct net_device *dev)
>+{
>+	ether_setup(dev);
>+
>+	dev->netdev_ops = &team_netdev_ops;
>+	dev->destructor	= team_destructor;
>+	dev->tx_queue_len = 0;
>+	dev->flags |= IFF_MULTICAST;
>+	dev->priv_flags &= ~(IFF_XMIT_DST_RELEASE | IFF_TX_SKB_SHARING);
>+
>+	/*
>+	 * Indicate we support unicast address filtering. That way core won't
>+	 * bring us to promisc mode in case a unicast addr is added.
>+	 * Leave this up to the underlying drivers.
>+	 */
>+	dev->priv_flags |= IFF_UNICAST_FLT;
>+
>+	dev->features |= NETIF_F_LLTX;
>+	dev->features |= NETIF_F_GRO;
>+	dev->hw_features = NETIF_F_HW_VLAN_TX |
>+			   NETIF_F_HW_VLAN_RX |
>+			   NETIF_F_HW_VLAN_FILTER;
>+
>+	dev->features |= dev->hw_features;
>+}
>+
>+static int team_newlink(struct net *src_net, struct net_device *dev,
>+			struct nlattr *tb[], struct nlattr *data[])
>+{
>+	int err;
>+
>+	if (tb[IFLA_ADDRESS] == NULL)
>+		random_ether_addr(dev->dev_addr);
>+
>+	err = register_netdevice(dev);
>+	if (err)
>+		return err;
>+
>+	return 0;
>+}
>+
>+static int team_validate(struct nlattr *tb[], struct nlattr *data[])
>+{
>+	if (tb[IFLA_ADDRESS]) {
>+		if (nla_len(tb[IFLA_ADDRESS]) != ETH_ALEN)
>+			return -EINVAL;
>+		if (!is_valid_ether_addr(nla_data(tb[IFLA_ADDRESS])))
>+			return -EADDRNOTAVAIL;
>+	}
>+	return 0;
>+}
>+
>+static struct rtnl_link_ops team_link_ops __read_mostly = {
>+	.kind		= DRV_NAME,
>+	.priv_size	= sizeof(struct team),
>+	.setup		= team_setup,
>+	.newlink	= team_newlink,
>+	.validate	= team_validate,
>+};
>+
>+
>+/***********************************
>+ * Generic netlink custom interface
>+ ***********************************/
>+
>+static struct genl_family team_nl_family = {
>+	.id		= GENL_ID_GENERATE,
>+	.name		= TEAM_GENL_NAME,
>+	.version	= TEAM_GENL_VERSION,
>+	.maxattr	= TEAM_ATTR_MAX,
>+	.netnsok	= true,
>+};
>+
>+static const struct nla_policy team_nl_policy[TEAM_ATTR_MAX + 1] = {
>+	[TEAM_ATTR_UNSPEC]			= { .type = NLA_UNSPEC, },
>+	[TEAM_ATTR_TEAM_IFINDEX]		= { .type = NLA_U32 },
>+	[TEAM_ATTR_LIST_OPTION]			= { .type = NLA_NESTED },
>+	[TEAM_ATTR_LIST_PORT]			= { .type = NLA_NESTED },
>+};
>+
>+static const struct nla_policy
>+team_nl_option_policy[TEAM_ATTR_OPTION_MAX + 1] = {
>+	[TEAM_ATTR_OPTION_UNSPEC]		= { .type = NLA_UNSPEC, },
>+	[TEAM_ATTR_OPTION_NAME] = {
>+		.type = NLA_STRING,
>+		.len = TEAM_STRING_MAX_LEN,
>+	},
>+	[TEAM_ATTR_OPTION_CHANGED]		= { .type = NLA_FLAG },
>+	[TEAM_ATTR_OPTION_TYPE]			= { .type = NLA_U8 },
>+	[TEAM_ATTR_OPTION_DATA] = {
>+		.type = NLA_BINARY,
>+		.len = TEAM_STRING_MAX_LEN,
>+	},
>+};
>+
>+static int team_nl_cmd_noop(struct sk_buff *skb, struct genl_info *info)
>+{
>+	struct sk_buff *msg;
>+	void *hdr;
>+	int err;
>+
>+	msg = nlmsg_new(NLMSG_GOODSIZE, GFP_KERNEL);
>+	if (!msg)
>+		return -ENOMEM;
>+
>+	hdr = genlmsg_put(msg, info->snd_pid, info->snd_seq,
>+			  &team_nl_family, 0, TEAM_CMD_NOOP);
>+	if (IS_ERR(hdr)) {
>+		err = PTR_ERR(hdr);
>+		goto err_msg_put;
>+	}
>+
>+	genlmsg_end(msg, hdr);
>+
>+	return genlmsg_unicast(genl_info_net(info), msg, info->snd_pid);
>+
>+err_msg_put:
>+	nlmsg_free(msg);
>+
>+	return err;
>+}
>+
>+/*
>+ * Netlink cmd functions should be locked by following two functions.
>+ * To ensure team_uninit would not be called in between, hold rcu_read_lock
>+ * all the time.
>+ */
>+static struct team *team_nl_team_get(struct genl_info *info)
>+{
>+	struct net *net = genl_info_net(info);
>+	int ifindex;
>+	struct net_device *dev;
>+	struct team *team;
>+
>+	if (!info->attrs[TEAM_ATTR_TEAM_IFINDEX])
>+		return NULL;
>+
>+	ifindex = nla_get_u32(info->attrs[TEAM_ATTR_TEAM_IFINDEX]);
>+	rcu_read_lock();
>+	dev = dev_get_by_index_rcu(net, ifindex);
>+	if (!dev || dev->netdev_ops != &team_netdev_ops) {
>+		rcu_read_unlock();
>+		return NULL;
>+	}
>+
>+	team = netdev_priv(dev);
>+	spin_lock(&team->lock);
>+	return team;
>+}
>+
>+static void team_nl_team_put(struct team *team)
>+{
>+	spin_unlock(&team->lock);
>+	rcu_read_unlock();
>+}
>+
>+static int team_nl_send_generic(struct genl_info *info, struct team *team,
>+				int (*fill_func)(struct sk_buff *skb,
>+						 struct genl_info *info,
>+						 int flags, struct team *team))
>+{
>+	struct sk_buff *skb;
>+	int err;
>+
>+	skb = nlmsg_new(NLMSG_GOODSIZE, GFP_KERNEL);
>+	if (!skb)
>+		return -ENOMEM;
>+
>+	err = fill_func(skb, info, NLM_F_ACK, team);
>+	if (err < 0)
>+		goto err_fill;
>+
>+	err = genlmsg_unicast(genl_info_net(info), skb, info->snd_pid);
>+	return err;
>+
>+err_fill:
>+	nlmsg_free(skb);
>+	return err;
>+}
>+
>+static int team_nl_fill_options_get_changed(struct sk_buff *skb,
>+					    u32 pid, u32 seq, int flags,
>+					    struct team *team,
>+					    struct team_option *changed_option)
>+{
>+	struct nlattr *option_list;
>+	void *hdr;
>+	struct team_option *option;
>+
>+	hdr = genlmsg_put(skb, pid, seq, &team_nl_family, flags,
>+			  TEAM_CMD_OPTIONS_GET);
>+	if (IS_ERR(hdr))
>+		return PTR_ERR(hdr);
>+
>+	NLA_PUT_U32(skb, TEAM_ATTR_TEAM_IFINDEX, team->dev->ifindex);
>+	option_list = nla_nest_start(skb, TEAM_ATTR_LIST_OPTION);
>+	if (!option_list)
>+		return -EMSGSIZE;
>+
>+	list_for_each_entry(option, &team->option_list, list) {
>+		struct nlattr *option_item;
>+		long arg;
>+
>+		option_item = nla_nest_start(skb, TEAM_ATTR_ITEM_OPTION);
>+		if (!option_item)
>+			goto nla_put_failure;
>+		NLA_PUT_STRING(skb, TEAM_ATTR_OPTION_NAME, option->name);
>+		if (option == changed_option)
>+			NLA_PUT_FLAG(skb, TEAM_ATTR_OPTION_CHANGED);
>+		switch (option->type) {
>+		case TEAM_OPTION_TYPE_U32:
>+			NLA_PUT_U8(skb, TEAM_ATTR_OPTION_TYPE, NLA_U32);
>+			team_option_get(team, option, &arg);
>+			NLA_PUT_U32(skb, TEAM_ATTR_OPTION_DATA, arg);
>+			break;
>+		case TEAM_OPTION_TYPE_STRING:
>+			NLA_PUT_U8(skb, TEAM_ATTR_OPTION_TYPE, NLA_STRING);
>+			team_option_get(team, option, &arg);
>+			NLA_PUT_STRING(skb, TEAM_ATTR_OPTION_DATA,
>+				       (char *) arg);
>+			break;
>+		default:
>+			BUG();
>+		}
>+		nla_nest_end(skb, option_item);
>+	}
>+
>+	nla_nest_end(skb, option_list);
>+	return genlmsg_end(skb, hdr);
>+
>+nla_put_failure:
>+	genlmsg_cancel(skb, hdr);
>+	return -EMSGSIZE;
>+}
>+
>+static int team_nl_fill_options_get(struct sk_buff *skb,
>+				    struct genl_info *info, int flags,
>+				    struct team *team)
>+{
>+	return team_nl_fill_options_get_changed(skb, info->snd_pid,
>+						info->snd_seq, NLM_F_ACK,
>+						team, NULL);
>+}
>+
>+static int team_nl_cmd_options_get(struct sk_buff *skb, struct genl_info *info)
>+{
>+	struct team *team;
>+	int err;
>+
>+	team = team_nl_team_get(info);
>+	if (!team)
>+		return -EINVAL;
>+
>+	err = team_nl_send_generic(info, team, team_nl_fill_options_get);
>+
>+	team_nl_team_put(team);
>+
>+	return err;
>+}
>+
>+static int team_nl_cmd_options_set(struct sk_buff *skb, struct genl_info *info)
>+{
>+	struct team *team;
>+	int err = 0;
>+	int i;
>+	struct nlattr *nl_option;
>+
>+	team = team_nl_team_get(info);
>+	if (!team)
>+		return -EINVAL;
>+
>+	err = -EINVAL;
>+	if (!info->attrs[TEAM_ATTR_LIST_OPTION]) {
>+		err = -EINVAL;
>+		goto team_put;
>+	}
>+
>+	nla_for_each_nested(nl_option, info->attrs[TEAM_ATTR_LIST_OPTION], i) {
>+		struct nlattr *mode_attrs[TEAM_ATTR_OPTION_MAX + 1];
>+		enum team_option_type opt_type;
>+		struct team_option *option;
>+		char *opt_name;
>+		bool opt_found = false;
>+
>+		if (nla_type(nl_option) != TEAM_ATTR_ITEM_OPTION) {
>+			err = -EINVAL;
>+			goto team_put;
>+		}
>+		err = nla_parse_nested(mode_attrs, TEAM_ATTR_OPTION_MAX,
>+				       nl_option, team_nl_option_policy);
>+		if (err)
>+			goto team_put;
>+		if (!mode_attrs[TEAM_ATTR_OPTION_NAME] ||
>+		    !mode_attrs[TEAM_ATTR_OPTION_TYPE] ||
>+		    !mode_attrs[TEAM_ATTR_OPTION_DATA]) {
>+			err = -EINVAL;
>+			goto team_put;
>+		}
>+		switch (nla_get_u8(mode_attrs[TEAM_ATTR_OPTION_TYPE])) {
>+		case NLA_U32:
>+			opt_type = TEAM_OPTION_TYPE_U32;
>+			break;
>+		case NLA_STRING:
>+			opt_type = TEAM_OPTION_TYPE_STRING;
>+			break;
>+		default:
>+			goto team_put;
>+		}
>+
>+		opt_name = nla_data(mode_attrs[TEAM_ATTR_OPTION_NAME]);
>+		list_for_each_entry(option, &team->option_list, list) {
>+			long arg;
>+			struct nlattr *opt_data_attr;
>+
>+			if (option->type != opt_type ||
>+			    strcmp(option->name, opt_name))
>+				continue;
>+			opt_found = true;
>+			opt_data_attr = mode_attrs[TEAM_ATTR_OPTION_DATA];
>+			switch (opt_type) {
>+			case TEAM_OPTION_TYPE_U32:
>+				arg = nla_get_u32(opt_data_attr);
>+				break;
>+			case TEAM_OPTION_TYPE_STRING:
>+				arg = (long) nla_data(opt_data_attr);
>+				break;
>+			default:
>+				BUG();
>+			}
>+			err = team_option_set(team, option, &arg);
>+			if (err)
>+				goto team_put;
>+		}
>+		if (!opt_found) {
>+			err = -ENOENT;
>+			goto team_put;
>+		}
>+	}
>+
>+team_put:
>+	team_nl_team_put(team);
>+
>+	return err;
>+}
>+
>+static int team_nl_fill_port_list_get_changed(struct sk_buff *skb,
>+					      u32 pid, u32 seq, int flags,
>+					      struct team *team,
>+					      struct team_port *changed_port)
>+{
>+	struct nlattr *port_list;
>+	void *hdr;
>+	struct team_port *port;
>+
>+	hdr = genlmsg_put(skb, pid, seq, &team_nl_family, flags,
>+			  TEAM_CMD_PORT_LIST_GET);
>+	if (IS_ERR(hdr))
>+		return PTR_ERR(hdr);
>+
>+	NLA_PUT_U32(skb, TEAM_ATTR_TEAM_IFINDEX, team->dev->ifindex);
>+	port_list = nla_nest_start(skb, TEAM_ATTR_LIST_PORT);
>+	if (!port_list)
>+		return -EMSGSIZE;
>+
>+	list_for_each_entry_rcu(port, &team->port_list, list) {
>+		struct nlattr *port_item;
>+
>+		port_item = nla_nest_start(skb, TEAM_ATTR_ITEM_PORT);
>+		if (!port_item)
>+			goto nla_put_failure;
>+		NLA_PUT_U32(skb, TEAM_ATTR_PORT_IFINDEX, port->dev->ifindex);
>+		if (port == changed_port)
>+			NLA_PUT_FLAG(skb, TEAM_ATTR_PORT_CHANGED);
>+		if (port->linkup)
>+			NLA_PUT_FLAG(skb, TEAM_ATTR_PORT_LINKUP);
>+		NLA_PUT_U32(skb, TEAM_ATTR_PORT_SPEED, port->speed);
>+		NLA_PUT_U8(skb, TEAM_ATTR_PORT_DUPLEX, port->duplex);
>+		nla_nest_end(skb, port_item);
>+	}
>+
>+	nla_nest_end(skb, port_list);
>+	return genlmsg_end(skb, hdr);
>+
>+nla_put_failure:
>+	genlmsg_cancel(skb, hdr);
>+	return -EMSGSIZE;
>+}
>+
>+static int team_nl_fill_port_list_get(struct sk_buff *skb,
>+				      struct genl_info *info, int flags,
>+				      struct team *team)
>+{
>+	return team_nl_fill_port_list_get_changed(skb, info->snd_pid,
>+						  info->snd_seq, NLM_F_ACK,
>+						  team, NULL);
>+}
>+
>+static int team_nl_cmd_port_list_get(struct sk_buff *skb,
>+				     struct genl_info *info)
>+{
>+	struct team *team;
>+	int err;
>+
>+	team = team_nl_team_get(info);
>+	if (!team)
>+		return -EINVAL;
>+
>+	err = team_nl_send_generic(info, team, team_nl_fill_port_list_get);
>+
>+	team_nl_team_put(team);
>+
>+	return err;
>+}
>+
>+static struct genl_ops team_nl_ops[] = {
>+	{
>+		.cmd = TEAM_CMD_NOOP,
>+		.doit = team_nl_cmd_noop,
>+		.policy = team_nl_policy,
>+	},
>+	{
>+		.cmd = TEAM_CMD_OPTIONS_SET,
>+		.doit = team_nl_cmd_options_set,
>+		.policy = team_nl_policy,
>+		.flags = GENL_ADMIN_PERM,
>+	},
>+	{
>+		.cmd = TEAM_CMD_OPTIONS_GET,
>+		.doit = team_nl_cmd_options_get,
>+		.policy = team_nl_policy,
>+		.flags = GENL_ADMIN_PERM,
>+	},
>+	{
>+		.cmd = TEAM_CMD_PORT_LIST_GET,
>+		.doit = team_nl_cmd_port_list_get,
>+		.policy = team_nl_policy,
>+		.flags = GENL_ADMIN_PERM,
>+	},
>+};
>+
>+static struct genl_multicast_group team_change_event_mcgrp = {
>+	.name = TEAM_GENL_CHANGE_EVENT_MC_GRP_NAME,
>+};
>+
>+static int team_nl_send_event_options_get(struct team *team,
>+					  struct team_option *changed_option)
>+{
>+	struct sk_buff *skb;
>+	int err;
>+	struct net *net = dev_net(team->dev);
>+
>+	skb = nlmsg_new(NLMSG_GOODSIZE, GFP_KERNEL);
>+	if (!skb)
>+		return -ENOMEM;
>+
>+	err = team_nl_fill_options_get_changed(skb, 0, 0, 0, team,
>+					       changed_option);
>+	if (err < 0)
>+		goto err_fill;
>+
>+	err = genlmsg_multicast_netns(net, skb, 0, team_change_event_mcgrp.id,
>+				      GFP_KERNEL);
>+	return err;
>+
>+err_fill:
>+	nlmsg_free(skb);
>+	return err;
>+}
>+
>+static int team_nl_send_event_port_list_get(struct team_port *port)
>+{
>+	struct sk_buff *skb;
>+	int err;
>+	struct net *net = dev_net(port->team->dev);
>+
>+	skb = nlmsg_new(NLMSG_GOODSIZE, GFP_KERNEL);
>+	if (!skb)
>+		return -ENOMEM;
>+
>+	err = team_nl_fill_port_list_get_changed(skb, 0, 0, 0,
>+						 port->team, port);
>+	if (err < 0)
>+		goto err_fill;
>+
>+	err = genlmsg_multicast_netns(net, skb, 0, team_change_event_mcgrp.id,
>+				      GFP_KERNEL);
>+	return err;
>+
>+err_fill:
>+	nlmsg_free(skb);
>+	return err;
>+}
>+
>+static int team_nl_init(void)
>+{
>+	int err;
>+
>+	err = genl_register_family_with_ops(&team_nl_family, team_nl_ops,
>+					    ARRAY_SIZE(team_nl_ops));
>+	if (err)
>+		return err;
>+
>+	err = genl_register_mc_group(&team_nl_family, &team_change_event_mcgrp);
>+	if (err)
>+		goto err_change_event_grp_reg;
>+
>+	return 0;
>+
>+err_change_event_grp_reg:
>+	genl_unregister_family(&team_nl_family);
>+
>+	return err;
>+}
>+
>+static void team_nl_fini(void)
>+{
>+	genl_unregister_family(&team_nl_family);
>+}
>+
>+
>+/******************
>+ * Change checkers
>+ ******************/
>+
>+static void __team_options_change_check(struct team *team,
>+					struct team_option *changed_option)
>+{
>+	int err;
>+
>+	err = team_nl_send_event_options_get(team, changed_option);
>+	if (err)
>+		netdev_warn(team->dev, "Failed to send options change via netlink\n");
>+}
>+
>+/* rtnl lock is held */
>+static void __team_port_change_check(struct team_port *port, bool linkup)
>+{
>+	int err;
>+
>+	if (port->linkup == linkup)
>+		return;
>+
>+	port->linkup = linkup;
>+	if (linkup) {
>+		struct ethtool_cmd ecmd;
>+
>+		err = __ethtool_get_settings(port->dev, &ecmd);
>+		if (!err) {
>+			port->speed = ethtool_cmd_speed(&ecmd);
>+			port->duplex = ecmd.duplex;
>+			goto send_event;
>+		}
>+	}
>+	port->speed = 0;
>+	port->duplex = 0;
>+
>+send_event:
>+	err = team_nl_send_event_port_list_get(port);
>+	if (err)
>+		netdev_warn(port->team->dev, "Failed to send port change of device %s via netlink\n",
>+			    port->dev->name);
>+
>+}
>+
>+static void team_port_change_check(struct team_port *port, bool linkup)
>+{
>+	struct team *team = port->team;
>+
>+	spin_lock(&team->lock);
>+	__team_port_change_check(port, linkup);
>+	spin_unlock(&team->lock);
>+}
>+
>+/************************************
>+ * Net device notifier event handler
>+ ************************************/
>+
>+static int team_device_event(struct notifier_block *unused,
>+			     unsigned long event, void *ptr)
>+{
>+	struct net_device *dev = (struct net_device *) ptr;
>+	struct team_port *port;
>+
>+	port = team_port_get_rtnl(dev);
>+	if (!port)
>+		return NOTIFY_DONE;
>+
>+	switch (event) {
>+	case NETDEV_UP:
>+		if (netif_carrier_ok(dev))
>+			team_port_change_check(port, true);
>+	case NETDEV_DOWN:
>+		team_port_change_check(port, false);
>+	case NETDEV_CHANGE:
>+		if (netif_running(port->dev))
>+			team_port_change_check(port,
>+					       !!netif_carrier_ok(port->dev));
>+		break;
>+	case NETDEV_UNREGISTER:
>+		team_del_slave(port->team->dev, dev);
>+		break;
>+	case NETDEV_FEAT_CHANGE:
>+		team_compute_features(port->team);
>+		break;
>+	case NETDEV_CHANGEMTU:
>+		/* Forbid to change mtu of underlying device */
>+		return NOTIFY_BAD;
>+	case NETDEV_CHANGEADDR:
>+		/* Forbid to change addr of underlying device */
>+		return NOTIFY_BAD;
>+	case NETDEV_PRE_TYPE_CHANGE:
>+		/* Forbid to change type of underlying device */
>+		return NOTIFY_BAD;
>+	}
>+	return NOTIFY_DONE;
>+}
>+
>+static struct notifier_block team_notifier_block __read_mostly = {
>+	.notifier_call = team_device_event,
>+};
>+
>+
>+/***********************
>+ * Module init and exit
>+ ***********************/
>+
>+static int __init team_module_init(void)
>+{
>+	int err;
>+
>+	register_netdevice_notifier(&team_notifier_block);
>+
>+	err = rtnl_link_register(&team_link_ops);
>+	if (err)
>+		goto err_rtnl_reg;
>+
>+	err = team_nl_init();
>+	if (err)
>+		goto err_nl_init;
>+
>+	return 0;
>+
>+err_nl_init:
>+	rtnl_link_unregister(&team_link_ops);
>+
>+err_rtnl_reg:
>+	unregister_netdevice_notifier(&team_notifier_block);
>+
>+	return err;
>+}
>+
>+static void __exit team_module_exit(void)
>+{
>+	team_nl_fini();
>+	rtnl_link_unregister(&team_link_ops);
>+	unregister_netdevice_notifier(&team_notifier_block);
>+}
>+
>+module_init(team_module_init);
>+module_exit(team_module_exit);
>+
>+MODULE_LICENSE("GPL v2");
>+MODULE_AUTHOR("Jiri Pirko <jpirko@redhat.com>");
>+MODULE_DESCRIPTION("Ethernet team device driver");
>+MODULE_ALIAS_RTNL_LINK(DRV_NAME);
>diff --git a/drivers/net/team/team_mode_activebackup.c b/drivers/net/team/team_mode_activebackup.c
>new file mode 100644
>index 0000000..1aa2bfb
>--- /dev/null
>+++ b/drivers/net/team/team_mode_activebackup.c
>@@ -0,0 +1,152 @@
>+/*
>+ * drivers/net/team/team_mode_activebackup.c - Active-backup mode for team
>+ * Copyright (c) 2011 Jiri Pirko <jpirko@redhat.com>
>+ *
>+ * This program is free software; you can redistribute it and/or modify
>+ * it under the terms of the GNU General Public License as published by
>+ * the Free Software Foundation; either version 2 of the License, or
>+ * (at your option) any later version.
>+ */
>+
>+#include <linux/kernel.h>
>+#include <linux/types.h>
>+#include <linux/module.h>
>+#include <linux/init.h>
>+#include <linux/errno.h>
>+#include <linux/netdevice.h>
>+#include <net/rtnetlink.h>
>+#include <linux/if_team.h>
>+
>+struct ab_priv {
>+	struct team_port __rcu *active_port;
>+};
>+
>+static struct ab_priv *ab_priv(struct team *team)
>+{
>+	return (struct ab_priv *) &team->mode_priv;
>+}
>+
>+static rx_handler_result_t ab_receive(struct team *team, struct team_port *port,
>+				      struct sk_buff *skb) {
>+	struct team_port *active_port;
>+
>+	active_port = rcu_dereference(ab_priv(team)->active_port);
>+	if (active_port != port)
>+		return RX_HANDLER_EXACT;
>+	return RX_HANDLER_ANOTHER;
>+}
>+
>+static bool ab_transmit(struct team *team, struct sk_buff *skb)
>+{
>+	struct team_port *active_port;
>+
>+	active_port = rcu_dereference(ab_priv(team)->active_port);
>+	if (unlikely(!active_port))
>+		goto drop;
>+	skb->dev = active_port->dev;
>+	if (dev_queue_xmit(skb))
>+		return false;
>+	return true;
>+
>+drop:
>+	dev_kfree_skb(skb);
>+	return false;
>+}
>+
>+static void ab_port_leave(struct team *team, struct team_port *port)
>+{
>+	if (ab_priv(team)->active_port == port)
>+		rcu_assign_pointer(ab_priv(team)->active_port, NULL);
>+}
>+
>+static void ab_port_change_mac(struct team *team, struct team_port *port)
>+{
>+	if (ab_priv(team)->active_port == port)
>+		team_port_set_team_mac(port);
>+}
>+
>+static int ab_active_port_get(struct team *team, void *arg)
>+{
>+	u32 *ifindex = arg;
>+
>+	*ifindex = 0;
>+	if (ab_priv(team)->active_port)
>+		*ifindex = ab_priv(team)->active_port->dev->ifindex;
>+	return 0;
>+}
>+
>+static int ab_active_port_set(struct team *team, void *arg)
>+{
>+	u32 *ifindex = arg;
>+	struct team_port *port;
>+
>+	list_for_each_entry_rcu(port, &team->port_list, list) {
>+		if (port->dev->ifindex == *ifindex) {
>+			struct team_port *ac_port = ab_priv(team)->active_port;
>+
>+			/* rtnl_lock needs to be held when setting macs */
>+			rtnl_lock();
>+			if (ac_port)
>+				team_port_set_orig_mac(ac_port);
>+			rcu_assign_pointer(ab_priv(team)->active_port, port);
>+			team_port_set_team_mac(port);
>+			rtnl_unlock();
>+			return 0;
>+		}
>+	}
>+	return -ENOENT;
>+}
>+
>+static struct team_option ab_options[] = {
>+	{
>+		.name = "activeport",
>+		.type = TEAM_OPTION_TYPE_U32,
>+		.getter = ab_active_port_get,
>+		.setter = ab_active_port_set,
>+	},
>+};
>+
>+int ab_init(struct team *team)
>+{
>+	team_options_register(team, ab_options, ARRAY_SIZE(ab_options));
>+	return 0;
>+}
>+
>+void ab_exit(struct team *team)
>+{
>+	team_options_unregister(team, ab_options, ARRAY_SIZE(ab_options));
>+}
>+
>+static const struct team_mode_ops ab_mode_ops = {
>+	.init			= ab_init,
>+	.exit			= ab_exit,
>+	.receive		= ab_receive,
>+	.transmit		= ab_transmit,
>+	.port_leave		= ab_port_leave,
>+	.port_change_mac	= ab_port_change_mac,
>+};
>+
>+static struct team_mode ab_mode = {
>+	.kind		= "activebackup",
>+	.owner		= THIS_MODULE,
>+	.priv_size	= sizeof(struct ab_priv),
>+	.ops		= &ab_mode_ops,
>+};
>+
>+static int __init ab_init_module(void)
>+{
>+	return team_mode_register(&ab_mode);
>+}
>+
>+static void __exit ab_cleanup_module(void)
>+{
>+	team_mode_unregister(&ab_mode);
>+}
>+
>+module_init(ab_init_module);
>+module_exit(ab_cleanup_module);
>+
>+MODULE_LICENSE("GPL v2");
>+MODULE_AUTHOR("Jiri Pirko <jpirko@redhat.com>");
>+MODULE_DESCRIPTION("Active-backup mode for team");
>+MODULE_ALIAS("team-mode-activebackup");
>diff --git a/drivers/net/team/team_mode_roundrobin.c b/drivers/net/team/team_mode_roundrobin.c
>new file mode 100644
>index 0000000..0374052
>--- /dev/null
>+++ b/drivers/net/team/team_mode_roundrobin.c
>@@ -0,0 +1,107 @@
>+/*
>+ * drivers/net/team/team_mode_roundrobin.c - Round-robin mode for team
>+ * Copyright (c) 2011 Jiri Pirko <jpirko@redhat.com>
>+ *
>+ * This program is free software; you can redistribute it and/or modify
>+ * it under the terms of the GNU General Public License as published by
>+ * the Free Software Foundation; either version 2 of the License, or
>+ * (at your option) any later version.
>+ */
>+
>+#include <linux/kernel.h>
>+#include <linux/types.h>
>+#include <linux/module.h>
>+#include <linux/init.h>
>+#include <linux/errno.h>
>+#include <linux/netdevice.h>
>+#include <linux/if_team.h>
>+
>+struct rr_priv {
>+	unsigned int sent_packets;
>+};
>+
>+static struct rr_priv *rr_priv(struct team *team)
>+{
>+	return (struct rr_priv *) &team->mode_priv;
>+}
>+
>+static struct team_port *__get_first_port_up(struct team *team,
>+					     struct team_port *port)
>+{
>+	struct team_port *cur;
>+
>+	if (port->linkup)
>+		return port;
>+	cur = port;
>+	list_for_each_entry_continue_rcu(cur, &team->port_list, list)
>+		if (cur->linkup)
>+			return cur;
>+	list_for_each_entry_rcu(cur, &team->port_list, list) {
>+		if (cur == port)
>+			break;
>+		if (cur->linkup)
>+			return cur;
>+	}
>+	return NULL;
>+}
>+
>+static bool rr_transmit(struct team *team, struct sk_buff *skb)
>+{
>+	struct team_port *port;
>+	int port_index;
>+
>+	port_index = rr_priv(team)->sent_packets++ % team->port_count;
>+	port = team_get_port_by_index_rcu(team, port_index);
>+	port = __get_first_port_up(team, port);
>+	if (unlikely(!port))
>+		goto drop;
>+	skb->dev = port->dev;
>+	if (dev_queue_xmit(skb))
>+		return false;
>+	return true;
>+
>+drop:
>+	dev_kfree_skb(skb);
>+	return false;
>+}
>+
>+static int rr_port_enter(struct team *team, struct team_port *port)
>+{
>+	return team_port_set_team_mac(port);
>+}
>+
>+static void rr_port_change_mac(struct team *team, struct team_port *port)
>+{
>+	team_port_set_team_mac(port);
>+}
>+
>+static const struct team_mode_ops rr_mode_ops = {
>+	.transmit		= rr_transmit,
>+	.port_enter		= rr_port_enter,
>+	.port_change_mac	= rr_port_change_mac,
>+};
>+
>+static struct team_mode rr_mode = {
>+	.kind		= "roundrobin",
>+	.owner		= THIS_MODULE,
>+	.priv_size	= sizeof(struct rr_priv),
>+	.ops		= &rr_mode_ops,
>+};
>+
>+static int __init rr_init_module(void)
>+{
>+	return team_mode_register(&rr_mode);
>+}
>+
>+static void __exit rr_cleanup_module(void)
>+{
>+	team_mode_unregister(&rr_mode);
>+}
>+
>+module_init(rr_init_module);
>+module_exit(rr_cleanup_module);
>+
>+MODULE_LICENSE("GPL v2");
>+MODULE_AUTHOR("Jiri Pirko <jpirko@redhat.com>");
>+MODULE_DESCRIPTION("Round-robin mode for team");
>+MODULE_ALIAS("team-mode-roundrobin");
>diff --git a/include/linux/Kbuild b/include/linux/Kbuild
>index 619b565..0b091b3 100644
>--- a/include/linux/Kbuild
>+++ b/include/linux/Kbuild
>@@ -185,6 +185,7 @@ header-y += if_pppol2tp.h
> header-y += if_pppox.h
> header-y += if_slip.h
> header-y += if_strip.h
>+header-y += if_team.h
> header-y += if_tr.h
> header-y += if_tun.h
> header-y += if_tunnel.h
>diff --git a/include/linux/if.h b/include/linux/if.h
>index db20bd4..06b6ef6 100644
>--- a/include/linux/if.h
>+++ b/include/linux/if.h
>@@ -79,6 +79,7 @@
> #define IFF_TX_SKB_SHARING	0x10000	/* The interface supports sharing
> 					 * skbs on transmit */
> #define IFF_UNICAST_FLT	0x20000		/* Supports unicast filtering	*/
>+#define IFF_TEAM_PORT	0x40000		/* device used as team port */
> 
> #define IF_GET_IFACE	0x0001		/* for querying only */
> #define IF_GET_PROTO	0x0002
>diff --git a/include/linux/if_team.h b/include/linux/if_team.h
>new file mode 100644
>index 0000000..de395fc
>--- /dev/null
>+++ b/include/linux/if_team.h
>@@ -0,0 +1,254 @@
>+/*
>+ * include/linux/if_team.h - Network team device driver header
>+ * Copyright (c) 2011 Jiri Pirko <jpirko@redhat.com>
>+ *
>+ * This program is free software; you can redistribute it and/or modify
>+ * it under the terms of the GNU General Public License as published by
>+ * the Free Software Foundation; either version 2 of the License, or
>+ * (at your option) any later version.
>+ */
>+
>+#ifndef _LINUX_IF_TEAM_H_
>+#define _LINUX_IF_TEAM_H_
>+
>+#ifdef __KERNEL__
>+
>+struct team_pcpu_stats {
>+	u64			rx_packets;
>+	u64			rx_bytes;
>+	u64			rx_multicast;
>+	u64			tx_packets;
>+	u64			tx_bytes;
>+	struct u64_stats_sync	syncp;
>+	u32			rx_dropped;
>+	u32			tx_dropped;
>+};
>+
>+struct team;
>+
>+struct team_port {
>+	struct net_device *dev;
>+	struct hlist_node hlist; /* node in hash list */
>+	struct list_head list; /* node in ordinary list */
>+	struct team *team;
>+	int index;
>+
>+	/*
>+	 * A place for storing original values of the device before it
>+	 * becomes a port.
>+	 */
>+	struct {
>+		unsigned char dev_addr[MAX_ADDR_LEN];
>+		unsigned int mtu;
>+	} orig;
>+
>+	bool linkup;
>+	u32 speed;
>+	u8 duplex;
>+
>+	struct rcu_head rcu;
>+};
>+
>+struct team_mode_ops {
>+	int (*init)(struct team *team);
>+	void (*exit)(struct team *team);
>+	rx_handler_result_t (*receive)(struct team *team,
>+				       struct team_port *port,
>+				       struct sk_buff *skb);
>+	bool (*transmit)(struct team *team, struct sk_buff *skb);
>+	int (*port_enter)(struct team *team, struct team_port *port);
>+	void (*port_leave)(struct team *team, struct team_port *port);
>+	void (*port_change_mac)(struct team *team, struct team_port *port);
>+};
>+
>+static inline void team_mode_ops_copy(struct team_mode_ops *dst,
>+				      const struct team_mode_ops *src)
>+{
>+	dst->init		= src->init;
>+	dst->exit		= src->exit;
>+	dst->receive		= src->receive;
>+	dst->transmit		= src->transmit;
>+	dst->port_enter		= src->port_enter;
>+	dst->port_leave		= src->port_leave;
>+	dst->port_change_mac	= src->port_change_mac;
>+}
>+
>+static inline void team_mode_ops_clear(struct team_mode_ops *dst)
>+{
>+	dst->init		= NULL;
>+	dst->exit		= NULL;
>+	dst->receive		= NULL;
>+	dst->transmit		= NULL;
>+	dst->port_enter		= NULL;
>+	dst->port_leave		= NULL;
>+	dst->port_change_mac	= NULL;
>+}
>+
>+enum team_option_type {
>+	TEAM_OPTION_TYPE_U32,
>+	TEAM_OPTION_TYPE_STRING,
>+};
>+
>+struct team_option {
>+	struct list_head list;
>+	const char *name;
>+	enum team_option_type type;
>+	int (*getter)(struct team *team, void *arg);
>+	int (*setter)(struct team *team, void *arg);
>+};
>+
>+struct team_mode {
>+	struct list_head list;
>+	const char *kind;
>+	struct module *owner;
>+	size_t priv_size;
>+	const struct team_mode_ops *ops;
>+};
>+
>+#define TEAM_PORT_HASHBITS 4
>+#define TEAM_PORT_HASHENTRIES (1 << TEAM_PORT_HASHBITS)
>+
>+#define TEAM_MODE_PRIV_LONGS 4
>+#define TEAM_MODE_PRIV_SIZE (sizeof(long) * TEAM_MODE_PRIV_LONGS)
>+
>+struct team {
>+	struct net_device *dev; /* associated netdevice */
>+	struct team_pcpu_stats __percpu *pcpu_stats;
>+
>+	spinlock_t lock; /* used for overall locking, e.g. port lists write */
>+
>+	/*
>+	 * port lists with port count
>+	 */
>+	int port_count;
>+	struct hlist_head port_hlist[TEAM_PORT_HASHENTRIES];
>+	struct list_head port_list;
>+
>+	struct list_head option_list;
>+
>+	const char *mode_kind;
>+	struct team_mode_ops mode_ops;
>+	long mode_priv[TEAM_MODE_PRIV_LONGS];
>+};
>+
>+static inline struct hlist_head *team_port_index_hash(struct team *team,
>+						      int port_index)
>+{
>+	return &team->port_hlist[port_index & (TEAM_PORT_HASHENTRIES - 1)];
>+}
>+
>+static inline struct team_port *team_get_port_by_index_rcu(struct team *team,
>+							   int port_index)
>+{
>+	struct hlist_node *p;
>+	struct team_port *port;
>+	struct hlist_head *head = team_port_index_hash(team, port_index);
>+
>+	hlist_for_each_entry_rcu(port, p, head, hlist)
>+		if (port->index == port_index)
>+			return port;
>+	return NULL;
>+}
>+
>+extern int team_port_set_orig_mac(struct team_port *port);
>+extern int team_port_set_team_mac(struct team_port *port);
>+extern void team_options_register(struct team *team,
>+				  struct team_option *option,
>+				  size_t option_count);
>+extern void team_options_unregister(struct team *team,
>+				    struct team_option *option,
>+				    size_t option_count);
>+extern int team_mode_register(struct team_mode *mode);
>+extern int team_mode_unregister(struct team_mode *mode);
>+
>+#endif /* __KERNEL__ */
>+
>+#define TEAM_STRING_MAX_LEN 32
>+
>+/**********************************
>+ * NETLINK_GENERIC netlink family.
>+ **********************************/
>+
>+enum {
>+	TEAM_CMD_NOOP,
>+	TEAM_CMD_OPTIONS_SET,
>+	TEAM_CMD_OPTIONS_GET,
>+	TEAM_CMD_PORT_LIST_GET,
>+
>+	__TEAM_CMD_MAX,
>+	TEAM_CMD_MAX = (__TEAM_CMD_MAX - 1),
>+};
>+
>+enum {
>+	TEAM_ATTR_UNSPEC,
>+	TEAM_ATTR_TEAM_IFINDEX,		/* u32 */
>+	TEAM_ATTR_LIST_OPTION,		/* nest */
>+	TEAM_ATTR_LIST_PORT,		/* nest */
>+
>+	__TEAM_ATTR_MAX,
>+	TEAM_ATTR_MAX = __TEAM_ATTR_MAX - 1,
>+};
>+
>+/* Nested layout of get/set msg:
>+ *
>+ *	[TEAM_ATTR_LIST_OPTION]
>+ *		[TEAM_ATTR_ITEM_OPTION]
>+ *			[TEAM_ATTR_OPTION_*], ...
>+ *		[TEAM_ATTR_ITEM_OPTION]
>+ *			[TEAM_ATTR_OPTION_*], ...
>+ *		...
>+ *	[TEAM_ATTR_LIST_PORT]
>+ *		[TEAM_ATTR_ITEM_PORT]
>+ *			[TEAM_ATTR_PORT_*], ...
>+ *		[TEAM_ATTR_ITEM_PORT]
>+ *			[TEAM_ATTR_PORT_*], ...
>+ *		...
>+ */
>+
>+enum {
>+	TEAM_ATTR_ITEM_OPTION_UNSPEC,
>+	TEAM_ATTR_ITEM_OPTION,		/* nest */
>+
>+	__TEAM_ATTR_ITEM_OPTION_MAX,
>+	TEAM_ATTR_ITEM_OPTION_MAX = __TEAM_ATTR_ITEM_OPTION_MAX - 1,
>+};
>+
>+enum {
>+	TEAM_ATTR_OPTION_UNSPEC,
>+	TEAM_ATTR_OPTION_NAME,		/* string */
>+	TEAM_ATTR_OPTION_CHANGED,	/* flag */
>+	TEAM_ATTR_OPTION_TYPE,		/* u8 */
>+	TEAM_ATTR_OPTION_DATA,		/* dynamic */
>+
>+	__TEAM_ATTR_OPTION_MAX,
>+	TEAM_ATTR_OPTION_MAX = __TEAM_ATTR_OPTION_MAX - 1,
>+};
>+
>+enum {
>+	TEAM_ATTR_ITEM_PORT_UNSPEC,
>+	TEAM_ATTR_ITEM_PORT,		/* nest */
>+
>+	__TEAM_ATTR_ITEM_PORT_MAX,
>+	TEAM_ATTR_ITEM_PORT_MAX = __TEAM_ATTR_ITEM_PORT_MAX - 1,
>+};
>+
>+enum {
>+	TEAM_ATTR_PORT_UNSPEC,
>+	TEAM_ATTR_PORT_IFINDEX,		/* u32 */
>+	TEAM_ATTR_PORT_CHANGED,		/* flag */
>+	TEAM_ATTR_PORT_LINKUP,		/* flag */
>+	TEAM_ATTR_PORT_SPEED,		/* u32 */
>+	TEAM_ATTR_PORT_DUPLEX,		/* u8 */
>+
>+	__TEAM_ATTR_PORT_MAX,
>+	TEAM_ATTR_PORT_MAX = __TEAM_ATTR_PORT_MAX - 1,
>+};
>+
>+/*
>+ * NETLINK_GENERIC related info
>+ */
>+#define TEAM_GENL_NAME "team"
>+#define TEAM_GENL_VERSION 0x1
>+#define TEAM_GENL_CHANGE_EVENT_MC_GRP_NAME "change_event"
>+
>+#endif /* _LINUX_IF_TEAM_H_ */
>diff --git a/include/linux/rculist.h b/include/linux/rculist.h
>index d079290..7586b2c 100644
>--- a/include/linux/rculist.h
>+++ b/include/linux/rculist.h
>@@ -288,6 +288,20 @@ static inline void list_splice_init_rcu(struct list_head *list,
> 	     pos = list_entry_rcu(pos->member.next, typeof(*pos), member))
> 
> /**
>+ * list_for_each_entry_continue_reverse_rcu - iterate backwards from the given point
>+ * @pos:	the type * to use as a loop cursor.
>+ * @head:	the head for your list.
>+ * @member:	the name of the list_struct within the struct.
>+ *
>+ * Start to iterate over list of given type backwards, continuing after
>+ * the current position.
>+ */
>+#define list_for_each_entry_continue_reverse_rcu(pos, head, member)	\
>+	for (pos = list_entry_rcu(pos->member.prev, typeof(*pos), member); \
>+	     &pos->member != (head);	\
>+	     pos = list_entry_rcu(pos->member.prev, typeof(*pos), member))
>+
>+/**
>  * hlist_del_rcu - deletes entry from hash list without re-initialization
>  * @n: the element to delete from the hash list.
>  *
>-- 
>1.7.6
>

^ permalink raw reply

* [PATCH] ipv6: Do not use routes from locally generated RAs
From: Andreas Hofmeister @ 2011-10-23 16:41 UTC (permalink / raw)
  To: netdev

When hybrid mode is enabled (accept_ra == 2), the kernel also sees RAs
generated locally. This is useful since it allows the kernel to auto-configure
its own interface addresses.

However, if 'accept_ra_defrtr' and/or 'accept_ra_rtr_pref' are set and the
locally generated RAs announce the default route and/or other route information,
the kernel happily inserts bogus routes with its own address as gateway.

With this patch, adding routes from an RA will be skipped when the RA's source
address matches any local address, just as if 'accept_ra_defrtr' and
'accept_ra_rtr_pref' were set to 0.
---
 net/ipv6/ndisc.c |    8 ++++++++
 1 files changed, 8 insertions(+), 0 deletions(-)

diff --git a/net/ipv6/ndisc.c b/net/ipv6/ndisc.c
index 67501b6..00fa46e1 100644
--- a/net/ipv6/ndisc.c
+++ b/net/ipv6/ndisc.c
@@ -1226,6 +1226,9 @@ static void ndisc_router_discovery(struct sk_buff *skb)
 	if (!in6_dev->cnf.accept_ra_defrtr)
 		goto skip_defrtr;
 
+	if (ipv6_chk_addr(dev_net(in6_dev->dev), &ipv6_hdr(skb)->saddr, NULL, 0))
+		goto skip_defrtr;
+
 	lifetime = ntohs(ra_msg->icmph.icmp6_rt_lifetime);
 
 #ifdef CONFIG_IPV6_ROUTER_PREF
@@ -1350,6 +1353,9 @@ skip_linkparms:
 		goto out;
 
 #ifdef CONFIG_IPV6_ROUTE_INFO
+	if (ipv6_chk_addr(dev_net(in6_dev->dev), &ipv6_hdr(skb)->saddr, NULL, 0))
+		goto skip_routeinfo;
+
 	if (in6_dev->cnf.accept_ra_rtr_pref && ndopts.nd_opts_ri) {
 		struct nd_opt_hdr *p;
 		for (p = ndopts.nd_opts_ri;
@@ -1367,6 +1373,8 @@ skip_linkparms:
 				      &ipv6_hdr(skb)->saddr);
 		}
 	}
+
+skip_routeinfo:
 #endif
 
 #ifdef CONFIG_IPV6_NDISC_NODETYPE
-- 
1.7.6.1
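
For readers following along, the decision this patch adds can be sketched
in user space roughly as below. This is only an illustration under stated
assumptions: struct addr6, addr_is_local() and ra_routes_allowed() are
hypothetical stand-ins (addr_is_local() plays the role of ipv6_chk_addr()),
not kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an IPv6 address; not the kernel's struct in6_addr. */
struct addr6 {
	unsigned char b[16];
};

/* Hypothetical stand-in for ipv6_chk_addr(): is addr one of our own addresses? */
static bool addr_is_local(const struct addr6 *local, size_t n,
			  const struct addr6 *addr)
{
	size_t i;

	assert(addr != NULL);
	for (i = 0; i < n; i++)
		if (memcmp(&local[i], addr, sizeof(*addr)) == 0)
			return true;
	return false;
}

/*
 * Routes carried by an RA may be installed only if the sysctl knob allows
 * it AND the RA was not sent from one of our own addresses.
 */
static bool ra_routes_allowed(bool accept_knob, const struct addr6 *local,
			      size_t n, const struct addr6 *ra_src)
{
	if (!accept_knob)
		return false;
	return !addr_is_local(local, n, ra_src);
}
```

The same predicate would apply to both the default-route path and the
route-information path, each with its own knob.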

^ permalink raw reply related

* Re: [patch net-next V2] net: introduce ethernet teaming device
From: Jiri Pirko @ 2011-10-23 15:50 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: netdev, davem, bhutchings, shemminger, fubar, andy, tgraf,
	ebiederm, mirqus, kaber, greearb, jesse, fbl, benjamin.poirier,
	jzupka
In-Reply-To: <1319380668.27507.19.camel@edumazet-laptop>

Sun, Oct 23, 2011 at 04:37:48PM CEST, eric.dumazet@gmail.com wrote:
>Le dimanche 23 octobre 2011 à 14:51 +0200, Jiri Pirko a écrit :
>
>> Yes. And team->mode_ops.receive can change only after synchronize_rcu is
>> done. It's not possible it changes within the window you are talking about.
>
>If it was true, you would not need the synchronize_rcu() call you added
>in __team_change_mode() :
>
>
>----------------------------------------------------------------------
>
>static int __team_change_mode(struct team *team,
>                             const struct team_mode *new_mode)
>{
>       /* Check if mode was previously set and do cleanup if so */
>       if (team->mode_kind) {
>               void (*exit_op)(struct team *team) = team->mode_ops.exit;
>
>               /* Clear ops area so no callback is called any longer */
>               team_mode_ops_clear(&team->mode_ops);
>
>               synchronize_rcu();
>
>               if (exit_op)
>                       exit_op(team);
>
>
>-----------------------------------------------------------------------
>
>So the question is : Why do you have this synchronize_rcu() call here ?

You are right. This call is redundant here. I'll remove it.
This also means I can use memset & memcpy for mode_ops after all.

I'll change that and add a comment about this locking situation to avoid
confusion.

Thanks a lot Eric!

Jirka

>
>
>
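
A rough user-space sketch of what the memset & memcpy variant mentioned
above could look like. The struct here is a hypothetical, cut-down ops
table (the real struct team_mode_ops has more callbacks), and demo_init()
and demo_exit() exist only so the example has something to copy.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, cut-down ops table; illustration only. */
struct mode_ops {
	int (*init)(void *priv);
	void (*exit)(void *priv);
};

/*
 * Clear every callback in one go instead of assigning NULL field by field.
 * Assumes all-bits-zero is a NULL function pointer, which holds on the
 * platforms Linux supports.
 */
static void mode_ops_clear(struct mode_ops *dst)
{
	assert(dst != NULL);
	memset(dst, 0, sizeof(*dst));
}

/* Copy the whole table at once instead of member by member. */
static void mode_ops_copy(struct mode_ops *dst, const struct mode_ops *src)
{
	assert(dst != NULL && src != NULL);
	memcpy(dst, src, sizeof(*dst));
}

/* Dummy callbacks so the table has something to point at. */
static int demo_init(void *priv)
{
	(void)priv;
	return 0;
}

static void demo_exit(void *priv)
{
	(void)priv;
}
```

Compared with the field-by-field helpers, this does not need touching
whenever a callback is added to the table.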

^ permalink raw reply

* Re: [PATCH 05/10] RDMA/cxgb4: Add DB Overflow Avoidance.
From: Steve Wise @ 2011-10-23 15:33 UTC (permalink / raw)
  To: Vipul Pandya
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, netdev-u79uwXL29TY76Z2rM5mHXA,
	roland-BHEL68pLQRGGvPXPguhicg, davem-fT/PcQaiUtIeIZ0/mPfg9Q,
	divy-ut6Up61K2wZBDgjK7y7TUQ, dm-ut6Up61K2wZBDgjK7y7TUQ,
	kumaras-ut6Up61K2wZBDgjK7y7TUQ
In-Reply-To: <1319044264-779-6-git-send-email-vipul-ut6Up61K2wZBDgjK7y7TUQ@public.gmane.org>


On 10/19/2011 12:10 PM, Vipul Pandya wrote:
>          - get FULL/EMPTY/DROP events from LLD
>
>          - on FULL event, disable normal user mode DB rings.
>
>          - add modify_qp semantics to allow user processes to call into
>          the kernel to ring doorbells without overflowing.
>
>          Add DB Full/Empty/Drop stats.
>
>          Mark queues when created indicating the doorbell state.
>
>          If we're in the middle of db overflow avoidance, then newly created
>          queues should start out in this mode.
>

Hey Vipul,

I just realized, we need to bump the ABI for iw_cxgb4 with this series 
so the user mode library can know if the driver supports the kernel mode 
db ringing. So bump C4IW_UVERBS_ABI_VERSION to 3.

Steve.

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [PATCH net-next] bonding: fix wrong port enabling in 802.3ad
From: Kartik Chandran @ 2011-10-23 15:04 UTC (permalink / raw)
  To: netdev; +Cc: Sriram Chidambaram, Hugh Holbrook
In-Reply-To: <CAGMsbm4DMPfT6zSZp+fH2XB_gRUHR9FSRDxcwBJVCOhUi4j9AA@mail.gmail.com>

By a strange coincidence, we ran into this exact issue last week while
experimenting with the "standby" state, where we discovered that the
current implementation in bond_3ad does not pay attention to the Mux
SM state when deciding to enable a slave port for transmission, which
causes misbehavior in forwarding on the partner side.

Since the question was raised on the thread about what the 802.3ad
spec says in this respect, we can confidently answer that the spec
clearly mandates that the aggregator should include a port for
transmission only when it goes to DISTRIBUTING (in independent
control mode) or COLLECTING+DISTRIBUTING (in coupled mode).
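
As a rough sketch of that rule only: the enum and function names below are
made up for illustration and are not the identifiers used in bond_3ad.

```c
#include <stdbool.h>

/* Hypothetical names for the Mux machine states; not taken from bond_3ad. */
enum mux_state {
	MUX_DETACHED,
	MUX_WAITING,
	MUX_ATTACHED,
	MUX_COLLECTING,              /* independent control only */
	MUX_DISTRIBUTING,            /* independent control only */
	MUX_COLLECTING_DISTRIBUTING, /* coupled control only */
};

/* A port may be used for transmission only in the final Mux state. */
static bool port_may_transmit(enum mux_state state, bool coupled_control)
{
	if (coupled_control)
		return state == MUX_COLLECTING_DISTRIBUTING;
	return state == MUX_DISTRIBUTING;
}
```

With a check of this shape, a port that is merely ATTACHED or COLLECTING is
never picked as a transmit slave.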

We are testing this extensively right now, so I'm happy to try out
this patch and report back with findings.

-Kartik

^ permalink raw reply

* Re: [PATCH] ipv4: fix ipsec forward performance regression
From: Julian Anastasov @ 2011-10-23 14:52 UTC (permalink / raw)
  To: Yan, Zheng
  Cc: netdev@vger.kernel.org, davem@davemloft.net,
	eric.dumazet@gmail.com, Kim Phillips
In-Reply-To: <4EA3C91C.3090801@intel.com>


	Hello,

On Sun, 23 Oct 2011, Yan, Zheng wrote:

> There is a bug in commit 5e2b61f (ipv4: Remove flowi from struct rtable).
> It makes xfrm4_fill_dst() modify the wrong data structure.
> 
> Signed-off-by: Zheng Yan <zheng.z.yan@intel.com>
> ---
>  net/ipv4/xfrm4_policy.c |   14 +++++++-------
>  1 files changed, 7 insertions(+), 7 deletions(-)
> 
> diff --git a/net/ipv4/xfrm4_policy.c b/net/ipv4/xfrm4_policy.c
> index fc5368a..a0b4c5d 100644
> --- a/net/ipv4/xfrm4_policy.c
> +++ b/net/ipv4/xfrm4_policy.c
> @@ -79,13 +79,13 @@ static int xfrm4_fill_dst(struct xfrm_dst *xdst, struct net_device *dev,
>  	struct rtable *rt = (struct rtable *)xdst->route;
>  	const struct flowi4 *fl4 = &fl->u.ip4;
>  
> -	rt->rt_key_dst = fl4->daddr;
> -	rt->rt_key_src = fl4->saddr;
> -	rt->rt_key_tos = fl4->flowi4_tos;
> -	rt->rt_route_iif = fl4->flowi4_iif;
> -	rt->rt_iif = fl4->flowi4_iif;
> -	rt->rt_oif = fl4->flowi4_oif;
> -	rt->rt_mark = fl4->flowi4_mark;
> +	xdst->u.rt.rt_key_dst = fl4->daddr;
> +	xdst->u.rt.rt_key_src = fl4->saddr;
> +	xdst->u.rt.rt_key_tos = fl4->flowi4_tos;
> +	xdst->u.rt.rt_route_iif = fl4->flowi4_iif;

	Maybe I'm missing something, but I don't see where
flowi4_iif is set for the forwarding case. __xfrm_route_forward
calls xfrm_decode_session which does not appear to set
flowi4_iif. When providing fl4 for output routes flowi4_iif
is always set to 0, so it represents rt_route_iif. But
then there are 2 variants for __ip_route_output_key:

- ip_route_output_slow sets flowi4_iif to loopback and
flowi4_oif to outdev during lookup but never restores them
to original values. It is assumed that caller uses outdev
from dst, not from flowi4_oif.

- for cached route we do not update flowi4_iif and flowi4_oif
in __ip_route_output_key, so the resulting fl4 can not be
used for these values. I assume, the current rules are that
only fl4.saddr and daddr are updated while flowi4_iif and
flowi4_oif are not. It looks wrong for flowi code to rely on them.

	Currently, we have 3 values for devices:

rt_iif: indev for input routes, resulting outdev for output routes
which plays the role as indev for loopback traffic.

rt_oif: original outdev key, 0 for input routes, can be 0 for
output routes if socket is not bound to oif

rt_route_iif: indev for input routes, 0 for output routes

	With the above rules for flowi4_iif and flowi4_oif,
it is impossible to select a value for rt_iif from fl4.
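
A small sketch of the three device values listed above, assuming a toy
struct (struct rt_devs and both helpers are hypothetical, not kernel code).
It only restates the rules: for an output route rt_iif holds the resulting
outdev, which is not something fl4 carries.

```c
/* Hypothetical container for the three device fields discussed above. */
struct rt_devs {
	int rt_iif;
	int rt_oif;
	int rt_route_iif;
};

/* Input route: rt_iif and rt_route_iif are the indev, rt_oif is 0. */
static struct rt_devs input_route_devs(int indev)
{
	struct rt_devs d = { .rt_iif = indev, .rt_oif = 0, .rt_route_iif = indev };
	return d;
}

/*
 * Output route: rt_oif keeps the original oif key (may be 0 for an
 * unbound socket), rt_iif is the outdev the lookup resolved to, and
 * rt_route_iif is 0.
 */
static struct rt_devs output_route_devs(int oif_key, int resulting_outdev)
{
	struct rt_devs d = {
		.rt_iif = resulting_outdev,
		.rt_oif = oif_key,
		.rt_route_iif = 0,
	};
	return d;
}
```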

	I don't know the xfrm code well, but maybe after the
mentioned change we damaged the rt_oif and rt_route_iif values
for the cached dst, which can lead to using the slow path all the time.
Even if rt_intern_hash() avoids caching similar dsts multiple
times, if a cached entry is damaged we will add more and
more new entries after every damage.

> +	xdst->u.rt.rt_iif = fl4->flowi4_iif;
> +	xdst->u.rt.rt_oif = fl4->flowi4_oif;
> +	xdst->u.rt.rt_mark = fl4->flowi4_mark;
>  
>  	xdst->u.dst.dev = dev;
>  	dev_hold(dev);

Regards

--
Julian Anastasov <ja@ssi.bg>

^ permalink raw reply

* Re: [patch net-next V2] net: introduce ethernet teaming device
From: Eric Dumazet @ 2011-10-23 14:37 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: netdev, davem, bhutchings, shemminger, fubar, andy, tgraf,
	ebiederm, mirqus, kaber, greearb, jesse, fbl, benjamin.poirier,
	jzupka
In-Reply-To: <20111023125101.GA20078@minipsycho.orion>

On Sunday, 23 October 2011 at 14:51 +0200, Jiri Pirko wrote:

> Yes. And team->mode_ops.receive can change only after synchronize_rcu is
> done. It's not possible it changes within the window you are talking about.

If that were true, you would not need the synchronize_rcu() call you added
in __team_change_mode():


----------------------------------------------------------------------

static int __team_change_mode(struct team *team,
                             const struct team_mode *new_mode)
{
       /* Check if mode was previously set and do cleanup if so */
       if (team->mode_kind) {
               void (*exit_op)(struct team *team) = team->mode_ops.exit;

               /* Clear ops area so no callback is called any longer */
               team_mode_ops_clear(&team->mode_ops);

               synchronize_rcu();

               if (exit_op)
                       exit_op(team);


-----------------------------------------------------------------------

So the question is : Why do you have this synchronize_rcu() call here ?

^ permalink raw reply

* [PATCH 03/16] mac_sonic: add irq resources and cleanup
From: Finn Thain @ 2011-10-23 14:11 UTC (permalink / raw)
  To: Geert Uytterhoeven; +Cc: linux-m68k, David Miller, netdev
In-Reply-To: <20111023141108.856998818@telegraphics.com.au>

[-- Attachment #1: macsonic-platform-irq-rsrc --]
[-- Type: text/plain, Size: 6140 bytes --]

Make better use of the SONIC platform device by adding irq resources. This moves the via_type logic out of the NIC device driver which improves modularity.

Since interrupt handlers now run with CPU interrupts disabled, we don't need the macsonic_interrupt() wrapper. Remove it.

For consistency, rename retval as err.

Signed-off-by: Finn Thain <fthain@telegraphics.com.au>

Index: linux-m68k/arch/m68k/mac/config.c
===================================================================
--- linux-m68k.orig/arch/m68k/mac/config.c	2011-10-22 23:02:22.000000000 +1100
+++ linux-m68k/arch/m68k/mac/config.c	2011-10-23 00:51:11.000000000 +1100
@@ -936,9 +936,16 @@ static struct platform_device esp_1_pdev
 	.id		= 1,
 };
 
+static struct resource sonic_rsrcs[] = {
+	{ .flags = IORESOURCE_IRQ },
+	{ .flags = IORESOURCE_IRQ },
+};
+
 static struct platform_device sonic_pdev = {
 	.name		= "macsonic",
 	.id		= -1,
+	.num_resources  = ARRAY_SIZE(sonic_rsrcs),
+	.resource       = sonic_rsrcs,
 };
 
 static struct platform_device mace_pdev = {
@@ -1002,6 +1009,10 @@ int __init mac_platform_init(void)
 
 	switch (macintosh_config->ether_type) {
 	case MAC_ETHER_SONIC:
+		sonic_rsrcs[0].start = sonic_rsrcs[0].end = IRQ_NUBUS_9;
+		if (via_alt_mapping)
+			sonic_rsrcs[1].start = sonic_rsrcs[1].end = IRQ_AUTO_3;
+
 		platform_device_register(&sonic_pdev);
 		break;
 	case MAC_ETHER_MACE:
Index: linux-m68k/drivers/net/macsonic.c
===================================================================
--- linux-m68k.orig/drivers/net/macsonic.c	2011-10-22 23:02:38.000000000 +1100
+++ linux-m68k/drivers/net/macsonic.c	2011-10-22 23:02:38.000000000 +1100
@@ -60,7 +60,6 @@
 #include <asm/dma.h>
 #include <asm/macintosh.h>
 #include <asm/macints.h>
-#include <asm/mac_via.h>
 
 static char mac_sonic_string[] = "macsonic";
 
@@ -127,61 +126,49 @@ static inline void bit_reverse_addr(unsi
 		addr[i] = bitrev8(addr[i]);
 }
 
-static irqreturn_t macsonic_interrupt(int irq, void *dev_id)
-{
-	irqreturn_t result;
-	unsigned long flags;
-
-	local_irq_save(flags);
-	result = sonic_interrupt(irq, dev_id);
-	local_irq_restore(flags);
-	return result;
-}
-
 static int macsonic_open(struct net_device* dev)
 {
-	int retval;
+	struct sonic_local *lp = netdev_priv(dev);
+	int err;
 
-	retval = request_irq(dev->irq, sonic_interrupt, 0, "sonic", dev);
-	if (retval) {
-		printk(KERN_ERR "%s: unable to get IRQ %d.\n",
-				dev->name, dev->irq);
-		goto err;
-	}
-	/* Under the A/UX interrupt scheme, the onboard SONIC interrupt comes
-	 * in at priority level 3. However, we sometimes get the level 2 inter-
-	 * rupt as well, which must prevent re-entrance of the sonic handler.
-	 */
-	if (dev->irq == IRQ_AUTO_3) {
-		retval = request_irq(IRQ_NUBUS_9, macsonic_interrupt, 0,
-				     "sonic", dev);
-		if (retval) {
-			printk(KERN_ERR "%s: unable to get IRQ %d.\n",
-					dev->name, IRQ_NUBUS_9);
-			goto err_irq;
+	err = request_irq(dev->irq, sonic_interrupt, 0, "SONIC", dev);
+	if (err) {
+		pr_err("%s: unable to get IRQ %d\n", dev->name, dev->irq);
+		goto out;
+	}
+	if (lp->irq1) {
+		err = request_irq(lp->irq1, sonic_interrupt, 0, "SONIC", dev);
+		if (err) {
+			pr_err("%s: unable to get IRQ %d\n",
+			       dev->name, lp->irq1);
+			goto out_irq;
 		}
 	}
-	retval = sonic_open(dev);
-	if (retval)
-		goto err_irq_nubus;
+	err = sonic_open(dev);
+	if (err)
+		goto out_irq1;
+
 	return 0;
 
-err_irq_nubus:
-	if (dev->irq == IRQ_AUTO_3)
-		free_irq(IRQ_NUBUS_9, dev);
-err_irq:
+out_irq1:
+	if (lp->irq1)
+		free_irq(lp->irq1, dev);
+out_irq:
 	free_irq(dev->irq, dev);
-err:
-	return retval;
+out:
+	return err;
 }
 
 static int macsonic_close(struct net_device* dev)
 {
+	struct sonic_local *lp = netdev_priv(dev);
 	int err;
+
 	err = sonic_close(dev);
+
+	if (lp->irq1)
+		free_irq(lp->irq1, dev);
 	free_irq(dev->irq, dev);
-	if (dev->irq == IRQ_AUTO_3)
-		free_irq(IRQ_NUBUS_9, dev);
 	return err;
 }
 
@@ -310,8 +297,9 @@ static void __devinit mac_onboard_sonic_
 	random_ether_addr(dev->dev_addr);
 }
 
-static int __devinit mac_onboard_sonic_probe(struct net_device *dev)
+static int __devinit mac_onboard_sonic_probe(struct platform_device *pdev)
 {
+	struct net_device *dev = platform_get_drvdata(pdev);
 	struct sonic_local* lp = netdev_priv(dev);
 	int sr;
 	int commslot = 0;
@@ -348,10 +336,12 @@ static int __devinit mac_onboard_sonic_p
 	/* Danger!  My arms are flailing wildly!  You *must* set lp->reg_offset
 	 * and dev->base_addr before using SONIC_READ() or SONIC_WRITE() */
 	dev->base_addr = ONBOARD_SONIC_REGISTERS;
-	if (via_alt_mapping)
-		dev->irq = IRQ_AUTO_3;
-	else
-		dev->irq = IRQ_NUBUS_9;
+
+	dev->irq = platform_get_irq(pdev, 0);
+	lp->irq1 = platform_get_irq(pdev, 1);
+
+	if (!dev->irq)
+		return -ENODEV;
 
 	if (!sonic_version_printed) {
 		printk(KERN_INFO "%s", version);
@@ -590,7 +580,7 @@ static int __devinit mac_sonic_probe(str
 	platform_set_drvdata(pdev, dev);
 
 	/* This will catch fatal stuff like -ENOMEM as well as success */
-	err = mac_onboard_sonic_probe(dev);
+	err = mac_onboard_sonic_probe(pdev);
 	if (err == 0)
 		goto found;
 	if (err != -ENODEV)
Index: linux-m68k/drivers/net/sonic.h
===================================================================
--- linux-m68k.orig/drivers/net/sonic.h	2011-10-22 23:02:22.000000000 +1100
+++ linux-m68k/drivers/net/sonic.h	2011-10-22 23:02:38.000000000 +1100
@@ -318,6 +318,9 @@ struct sonic_local {
 	unsigned int eol_rx;
 	unsigned int eol_tx;           /* last unacked transmit packet */
 	unsigned int next_tx;          /* next free TD */
+#ifdef CONFIG_MAC
+	int irq1;                      /* Second IRQ for Mac Quadras */
+#endif
 	struct device *device;         /* generic device */
 	struct net_device_stats stats;
 };
Index: linux-m68k/arch/m68k/mac/via.c
===================================================================
--- linux-m68k.orig/arch/m68k/mac/via.c	2011-10-22 23:02:38.000000000 +1100
+++ linux-m68k/arch/m68k/mac/via.c	2011-10-23 00:51:10.000000000 +1100
@@ -40,7 +40,6 @@
 volatile __u8 *via1, *via2;
 int rbv_present;
 int via_alt_mapping;
-EXPORT_SYMBOL(via_alt_mapping);
 static __u8 rbv_clear;
 
 /*

^ permalink raw reply

* Re: [patch net-next V2] net: introduce ethernet teaming device
From: Jiri Pirko @ 2011-10-23 12:51 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: netdev, davem, bhutchings, shemminger, fubar, andy, tgraf,
	ebiederm, mirqus, kaber, greearb, jesse, fbl, benjamin.poirier,
	jzupka
In-Reply-To: <1319366709.27507.14.camel@edumazet-laptop>

Sun, Oct 23, 2011 at 12:45:09PM CEST, eric.dumazet@gmail.com wrote:
>On Sunday, 23 October 2011 at 10:25 +0200, Jiri Pirko wrote:
>
>> Please forgive me, it's possible I'm missing something. But I see no way how
>> team->mode_ops.receive can change during team_handle_frame (holding
>> rcu_read_lock) for the reason I stated earlier.
>> 
>> team_port_del() calls netdev_rx_handler_unregister() and after that it
>> calls synchronize_rcu(). That ensures that by the finish of team_port_del()
>> run, team_handle_frame() is not called for this port anymore.
>> 
>> And this combined with "if (!list_empty(&team->port_list))" check in
>> team_change_mode() ensures safety.
>> 
>> Of course team_port_del() and team_change_mode() are both protected by
>> team->lock so they are mutually excluded.
>
>Oh well.
>
>Jirka, I believe I do understand how RCU is working ;)

With this I do not argue :)
But note that team->mode_ops.receive is not an RCU-protected pointer per
se (no rcu_assign, rcu_dereference).

>
>There is an obvious problem I pointed out to you, but you persist in leaving
>this potential bug.
>
>After netdev_rx_handler_unregister(), you can still have other cpus
>calling your handler and reading/using previous memory content. Only
>after synchronize_rcu() you can be safe. But in your patch the bug
>window is _exactly_ _before_ synchronize_rcu() returns.

Yes. And team->mode_ops.receive can change only after synchronize_rcu is
done. It's not possible it changes within the window you are talking about.

>
>Your spinlock won't help you at all, since readers don't take it.
>The spinlock only protects writers.

Sure. team->lock ensures that team_change_mode() is not called within
the "bug window" but only after team_port_del() finishes.

>
>
>So a reader, even holding rcu lock, can really see two different
>mode_ops.receive values for the :
>
>if (team->mode_ops.receive)
>	res = team->mode_ops.receive(team, port, skb);
>
>
>rcu_read_lock() doesn't mean the reader sees a unique .receive value.

I'm very well aware of this.

>
>I am afraid you misunderstood the point.
>
>The real point of RCU here is that the _writer_ won't return from
>synchronize_rcu() while at least one reader is still running the handler.
>
>No problem with me, I'll just post a patch later, I just can't Ack your
>work as is.
>
>
>

^ permalink raw reply

* Re: [PATCH 00/28 v6] m68k: Convert to genirq
From: Geert Uytterhoeven @ 2011-10-23 12:12 UTC (permalink / raw)
  To: Greg Ungerer; +Cc: linux-m68k, Thomas Gleixner, linux-kernel, netdev
In-Reply-To: <4EA3FA86.7000905@snapgear.com>

On Sun, Oct 23, 2011 at 13:29, Greg Ungerer <gerg@snapgear.com> wrote:
> On 10/23/2011 07:49 PM, Geert Uytterhoeven wrote:
>> On Thu, Oct 20, 2011 at 14:18, Geert Uytterhoeven<geert@linux-m68k.org>
>>  wrote:
>>> On Sun, Sep 11, 2011 at 13:59, Geert Uytterhoeven<geert@linux-m68k.org>
>>>  wrote:
>>>> This patch series converts the m68k/mmu (nommu was converted before)
>>>> architecture to the generic hardirq framework.
>>>>
>>>>  - [01/28] genirq: Add missing "else" in irq_shutdown()
>>>>  - [02/28] ide-{cd,floppy,tape}: Do not include <linux/irq.h>
>>>>  - [03/28] keyboard: Do not include <linux/irq.h>
>>>>  - [04/28] m68k/irq: Rename irq_controller to irq_chip
>>>>  - [05/28] m68k/irq: Kill irq_node_t typedef, always use struct
>>>> irq_node
>>>>  - [06/28] m68k/irq: Rename irq_node to irq_data
>>>>  - [07/28] m68k/irq: Switch irq_chip methods to "struct irq_data *data"
>>>>  - [08/28] m68k/irq: Rename setup_irq() to m68k_setup_irq() and make it
>>>> static
>>>>  - [09/28] m68k/irq: Extract irq_set_chip()
>>>>  - [10/28] m68k/irq: Add m68k_setup_irq_controller()
>>>>  - [11/28] m68k/irq: Rename {,__}m68k_handle_int()
>>>>  - [12/28] m68k/irq: Remove obsolete IRQ_FLG_* definitions and users
>>>>  - [13/28] m68k/irq: Add genirq support
>>>>  - [14/28] m68k/atari: Convert Atari to genirq
>>>>  - [15/28] m68k/atari: Remove code and comments about different irq
>>>> types
>>>>  - [16/28] m68k/amiga: Refactor amiints.c
>>>>  - [17/28] m68k/amiga: Convert Amiga to genirq
>>>>  - [18/28] m68k/amiga: Optimize interrupts using chain handlers
>>>>  - [19/28] m68k/mac: Convert Mac to genirq
>>>>  - [20/28] m68k/mac: Optimize interrupts using chain handlers
>>>>  - [21/28] m68k/hp300: Convert HP9000/300 and HP9000/400 to genirq
>>>>  - [22/28] m68k/vme: Convert VME to genirq
>>>>  - [23/28] m68k/apollo: Convert Apollo to genirq
>>>>  - [24/28] m68k/sun3: Use the kstat_irqs_cpu() wrapper
>>>>  - [25/28] m68k/sun3: Convert Sun3/3x to genirq
>>>>  - [26/28] m68k/q40: Convert Q40/Q60 to genirq
>>>>  - [27/28] m68k/irq: Remove obsolete m68k irq framework
>>>>  - [28/28] m68k/irq: Remove obsolete support for user vector interrupt
>>>> fixups
>>>>
>>>> Overview:
>>>>  - [01] is a fix for the core genirq code,
>>>
>>> This went into v3.1-rc6.
>>>
>>>>  - [02-03] are fixes to avoid compile problems later in the conversion
>>>>    process,
>>>
>>> The keyboard patch went into the tty -next tree.
>>> The IDE one is still pending (I've just resent it).
>>
>> The IDE one got acked in the meantime.
>>
>>>> I will update my m68k-genirq branch as soon as master.kernel.org is
>>>> available
>>>> again.
>>>
>>> Updated, on top of m68k master (which is at v3.1-rc10 now).
>>>
>>> http://git.kernel.org/?p=linux/kernel/git/geert/linux-m68k.git;a=shortlog;h=refs/heads/m68k-genirq
>>>
>>> If no one objects, I'd like to add this to the m68k master and for-3.2
>>> branches.
>>
>> I added it to m68k master.
>>
>> As there were several merge conflicts with current -next
>> (arch/m68k/kernel/Makefile_mm
>> due to the mmu/nommu merge, and drivers/net/macsonic.c in [12/28] due
>> to the network
>> driver reshuffling dance), I did not add it to for-3.2 and for-next,
>> but to for-3.3.
>>
>> Depending on Stephen's return during or after the merge window, and
>> the merge timing
>> of the m68knommu and netdev trees, I may still try to sneak it in 3.2,
>> though.
>
> I can ask Linus to pull the m68knommu tree early in the merge window,
> if that will help.

Thanks, that will help. Then I can rebase on top of that.

And I'll split "[12/28] m68k/irq: Remove obsolete IRQ_FLG_*
definitions and users",
so the net/scsi/tty parts can hit the respective trees separately.

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply

* Re: [PATCH 00/28 v6] m68k: Convert to genirq
From: Greg Ungerer @ 2011-10-23 11:29 UTC (permalink / raw)
  To: Geert Uytterhoeven; +Cc: linux-m68k, Thomas Gleixner, linux-kernel, netdev
In-Reply-To: <CAMuHMdX7C5NU5_MO5GdO=Pf2X59B3vFo4Y9B5wg-NQzwVbbLZA@mail.gmail.com>

Hi Geert,

On 10/23/2011 07:49 PM, Geert Uytterhoeven wrote:
> On Thu, Oct 20, 2011 at 14:18, Geert Uytterhoeven<geert@linux-m68k.org>  wrote:
>> On Sun, Sep 11, 2011 at 13:59, Geert Uytterhoeven<geert@linux-m68k.org>  wrote:
>>> This patch series converts the m68k/mmu (nommu was converted before)
>>> architecture to the generic hardirq framework.
>>>
>>>  - [01/28] genirq: Add missing "else" in irq_shutdown()
>>>  - [02/28] ide-{cd,floppy,tape}: Do not include <linux/irq.h>
>>>  - [03/28] keyboard: Do not include <linux/irq.h>
>>>  - [04/28] m68k/irq: Rename irq_controller to irq_chip
>>>  - [05/28] m68k/irq: Kill irq_node_t typedef, always use struct irq_node
>>>  - [06/28] m68k/irq: Rename irq_node to irq_data
>>>  - [07/28] m68k/irq: Switch irq_chip methods to "struct irq_data *data"
>>>  - [08/28] m68k/irq: Rename setup_irq() to m68k_setup_irq() and make it static
>>>  - [09/28] m68k/irq: Extract irq_set_chip()
>>>  - [10/28] m68k/irq: Add m68k_setup_irq_controller()
>>>  - [11/28] m68k/irq: Rename {,__}m68k_handle_int()
>>>  - [12/28] m68k/irq: Remove obsolete IRQ_FLG_* definitions and users
>>>  - [13/28] m68k/irq: Add genirq support
>>>  - [14/28] m68k/atari: Convert Atari to genirq
>>>  - [15/28] m68k/atari: Remove code and comments about different irq types
>>>  - [16/28] m68k/amiga: Refactor amiints.c
>>>  - [17/28] m68k/amiga: Convert Amiga to genirq
>>>  - [18/28] m68k/amiga: Optimize interrupts using chain handlers
>>>  - [19/28] m68k/mac: Convert Mac to genirq
>>>  - [20/28] m68k/mac: Optimize interrupts using chain handlers
>>>  - [21/28] m68k/hp300: Convert HP9000/300 and HP9000/400 to genirq
>>>  - [22/28] m68k/vme: Convert VME to genirq
>>>  - [23/28] m68k/apollo: Convert Apollo to genirq
>>>  - [24/28] m68k/sun3: Use the kstat_irqs_cpu() wrapper
>>>  - [25/28] m68k/sun3: Convert Sun3/3x to genirq
>>>  - [26/28] m68k/q40: Convert Q40/Q60 to genirq
>>>  - [27/28] m68k/irq: Remove obsolete m68k irq framework
>>>  - [28/28] m68k/irq: Remove obsolete support for user vector interrupt fixups
>>>
>>> Overview:
>>>  - [01] is a fix for the core genirq code,
>>
>> This went into v3.1-rc6.
>>
>>>  - [02-03] are fixes to avoid compile problems later in the conversion
>>>    process,
>>
>> The keyboard patch went into the tty -next tree.
>> The IDE one is still pending (I've just resent it).
>
> The IDE one got acked in the meantime.
>
>>> I will update my m68k-genirq branch as soon as master.kernel.org is available
>>> again.
>>
>> Updated, on top of m68k master (which is at v3.1-rc10 now).
>> http://git.kernel.org/?p=linux/kernel/git/geert/linux-m68k.git;a=shortlog;h=refs/heads/m68k-genirq
>>
>> If no one objects, I'd like to add this to the m68k master and for-3.2 branches.
>
> I added it to m68k master.
>
> As there were several merge conflicts with current -next
> (arch/m68k/kernel/Makefile_mm
> due to the mmu/nommu merge, and drivers/net/macsonic.c in [12/28] due
> to the network
> driver reshuffling dance), I did not add it to for-3.2 and for-next,
> but to for-3.3.
>
> Depending on Stephen's return during or after the merge window, and
> the merge timing
> of the m68knommu and netdev trees, I may still try to sneak it in 3.2, though.

I can ask Linus to pull the m68knommu tree early in the merge window,
if that will help.

Regards
Greg


------------------------------------------------------------------------
Greg Ungerer  --  Principal Engineer        EMAIL:     gerg@snapgear.com
SnapGear Group, McAfee                      PHONE:       +61 7 3435 2888
8 Gardner Close,                            FAX:         +61 7 3891 3630
Milton, QLD, 4064, Australia                WEB: http://www.SnapGear.com

^ permalink raw reply

* Re: [patch net-next V2] net: introduce ethernet teaming device
From: Eric Dumazet @ 2011-10-23 10:45 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: netdev, davem, bhutchings, shemminger, fubar, andy, tgraf,
	ebiederm, mirqus, kaber, greearb, jesse, fbl, benjamin.poirier,
	jzupka
In-Reply-To: <20111023082545.GA15908@minipsycho.orion>

On Sunday, 23 October 2011 at 10:25 +0200, Jiri Pirko wrote:

> Please forgive me, it's possible I'm missing something. But I see no way how
> team->mode_ops.receive can change during team_handle_frame (holding
> rcu_read_lock) for the reason I stated earlier.
> 
> team_port_del() calls netdev_rx_handler_unregister() and after that it
> calls synchronize_rcu(). That ensures that by the finish of team_port_del()
> run, team_handle_frame() is not called for this port anymore.
> 
> And this combined with "if (!list_empty(&team->port_list))" check in
> team_change_mode() ensures safety.
> 
> Of course team_port_del() and team_change_mode() are both protected by
> team->lock so they are mutually excluded.

Oh well.

Jirka, I believe I do understand how RCU is working ;)

There is an obvious problem I pointed out to you, but you persist in leaving
this potential bug.

After netdev_rx_handler_unregister(), you can still have other cpus
calling your handler and reading/using previous memory content. Only
after synchronize_rcu() you can be safe. But in your patch the bug
window is _exactly_ _before_ synchronize_rcu() returns.

Your spinlock won't help you at all, since readers don't take it.
The spinlock only protects writers.


So a reader, even holding rcu lock, can really see two different
mode_ops.receive values for the :

if (team->mode_ops.receive)
	res = team->mode_ops.receive(team, port, skb);


rcu_read_lock() doesn't mean the reader sees a unique .receive value.

I am afraid you misunderstood the point.

The real point of RCU here is that the _writer_ won't return from
synchronize_rcu() while at least one reader is still running the handler.

No problem with me, I'll just post a patch later, I just can't Ack your
work as is.
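
For illustration, a minimal userspace sketch of the "read the pointer once"
pattern that avoids the double read shown above. The struct layout, the int
port/skb arguments, sample_receive() and the -1 return are simplifications
assumed here; the volatile cast is only a stand-in for what ACCESS_ONCE()
does in the kernel, not the team driver's actual code.

```c
#include <stddef.h>

struct team;

/* Simplified callback type; the real one takes a port and an skb. */
typedef int (*receive_fn)(struct team *team, int port, int skb);

struct team_mode_ops {
	receive_fn receive;
};

struct team {
	struct team_mode_ops mode_ops;
};

/* Example callback, present only so the sketch can be exercised. */
static int sample_receive(struct team *team, int port, int skb)
{
	(void)team;
	return port + skb;
}

static int handle_frame(struct team *team, int port, int skb)
{
	/*
	 * Load the pointer exactly once into a local. The NULL test and
	 * the call then use the same value even if a writer clears
	 * mode_ops.receive in between.
	 */
	receive_fn recv = *(receive_fn volatile *)&team->mode_ops.receive;

	if (recv)
		return recv(team, port, skb);
	return -1; /* mode has no receive callback */
}
```

The snapshot only removes the test-then-call race; the writer still has to
wait for readers before tearing down whatever the old callback relies on.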

^ permalink raw reply

* Re: [PATCH 00/28 v6] m68k: Convert to genirq
From: Geert Uytterhoeven @ 2011-10-23  9:49 UTC (permalink / raw)
  To: linux-m68k, Greg Ungerer, Thomas Gleixner; +Cc: linux-kernel, netdev
In-Reply-To: <CAMuHMdWOhKcoDoyTnKJzruZ=gpV5uKNFpXnGCkrac7oEkCihJw@mail.gmail.com>

On Thu, Oct 20, 2011 at 14:18, Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> On Sun, Sep 11, 2011 at 13:59, Geert Uytterhoeven <geert@linux-m68k.org> wrote:
>> This patch series converts the m68k/mmu (nommu was converted before)
>> architecture to the generic hardirq framework.
>>
>>  - [01/28] genirq: Add missing "else" in irq_shutdown()
>>  - [02/28] ide-{cd,floppy,tape}: Do not include <linux/irq.h>
>>  - [03/28] keyboard: Do not include <linux/irq.h>
>>  - [04/28] m68k/irq: Rename irq_controller to irq_chip
>>  - [05/28] m68k/irq: Kill irq_node_t typedef, always use struct irq_node
>>  - [06/28] m68k/irq: Rename irq_node to irq_data
>>  - [07/28] m68k/irq: Switch irq_chip methods to "struct irq_data *data"
>>  - [08/28] m68k/irq: Rename setup_irq() to m68k_setup_irq() and make it static
>>  - [09/28] m68k/irq: Extract irq_set_chip()
>>  - [10/28] m68k/irq: Add m68k_setup_irq_controller()
>>  - [11/28] m68k/irq: Rename {,__}m68k_handle_int()
>>  - [12/28] m68k/irq: Remove obsolete IRQ_FLG_* definitions and users
>>  - [13/28] m68k/irq: Add genirq support
>>  - [14/28] m68k/atari: Convert Atari to genirq
>>  - [15/28] m68k/atari: Remove code and comments about different irq types
>>  - [16/28] m68k/amiga: Refactor amiints.c
>>  - [17/28] m68k/amiga: Convert Amiga to genirq
>>  - [18/28] m68k/amiga: Optimize interrupts using chain handlers
>>  - [19/28] m68k/mac: Convert Mac to genirq
>>  - [20/28] m68k/mac: Optimize interrupts using chain handlers
>>  - [21/28] m68k/hp300: Convert HP9000/300 and HP9000/400 to genirq
>>  - [22/28] m68k/vme: Convert VME to genirq
>>  - [23/28] m68k/apollo: Convert Apollo to genirq
>>  - [24/28] m68k/sun3: Use the kstat_irqs_cpu() wrapper
>>  - [25/28] m68k/sun3: Convert Sun3/3x to genirq
>>  - [26/28] m68k/q40: Convert Q40/Q60 to genirq
>>  - [27/28] m68k/irq: Remove obsolete m68k irq framework
>>  - [28/28] m68k/irq: Remove obsolete support for user vector interrupt fixups
>>
>> Overview:
>>  - [01] is a fix for the core genirq code,
>
> This went into v3.1-rc6.
>
>>  - [02-03] are fixes to avoid compile problems later in the conversion
>>    process,
>
> The keyboard patch went into the tty -next tree.
> The IDE one is still pending (I've just resent it).

The IDE one got acked in the meantime.

>> I will update my m68k-genirq branch as soon as master.kernel.org is available
>> again.
>
> Updated, on top of m68k master (which is at v3.1-rc10 now).
> http://git.kernel.org/?p=linux/kernel/git/geert/linux-m68k.git;a=shortlog;h=refs/heads/m68k-genirq
>
> If no one objects, I'd like to add this to the m68k master and for-3.2 branches.

I added it to m68k master.

As there were several merge conflicts with current -next
(arch/m68k/kernel/Makefile_mm
due to the mmu/nommu merge, and drivers/net/macsonic.c in [12/28] due
to the network
driver reshuffling dance), I did not add it to for-3.2 and for-next,
but to for-3.3.

Depending on Stephen's return during or after the merge window, and
the merge timing
of the m68knommu and netdev trees, I may still try to sneak it in 3.2, though.

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply

* Re: [PATCH] ipv4: fix ipsec forward performance regression
From: Eric Dumazet @ 2011-10-23  9:03 UTC (permalink / raw)
  To: Yan, Zheng; +Cc: netdev@vger.kernel.org, davem@davemloft.net, Kim Phillips
In-Reply-To: <4EA3C91C.3090801@intel.com>

On Sunday, 23 October 2011 at 15:58 +0800, Yan, Zheng wrote:
> There is a bug in commit 5e2b61f (ipv4: Remove flowi from struct rtable).
> It makes xfrm4_fill_dst() modify the wrong data structure.
> 
> Signed-off-by: Zheng Yan <zheng.z.yan@intel.com>
> ---
>  net/ipv4/xfrm4_policy.c |   14 +++++++-------
>  1 files changed, 7 insertions(+), 7 deletions(-)
> 
> diff --git a/net/ipv4/xfrm4_policy.c b/net/ipv4/xfrm4_policy.c
> index fc5368a..a0b4c5d 100644
> --- a/net/ipv4/xfrm4_policy.c
> +++ b/net/ipv4/xfrm4_policy.c
> @@ -79,13 +79,13 @@ static int xfrm4_fill_dst(struct xfrm_dst *xdst, struct net_device *dev,
>  	struct rtable *rt = (struct rtable *)xdst->route;
>  	const struct flowi4 *fl4 = &fl->u.ip4;
>  
> -	rt->rt_key_dst = fl4->daddr;
> -	rt->rt_key_src = fl4->saddr;
> -	rt->rt_key_tos = fl4->flowi4_tos;
> -	rt->rt_route_iif = fl4->flowi4_iif;
> -	rt->rt_iif = fl4->flowi4_iif;
> -	rt->rt_oif = fl4->flowi4_oif;
> -	rt->rt_mark = fl4->flowi4_mark;
> +	xdst->u.rt.rt_key_dst = fl4->daddr;
> +	xdst->u.rt.rt_key_src = fl4->saddr;
> +	xdst->u.rt.rt_key_tos = fl4->flowi4_tos;
> +	xdst->u.rt.rt_route_iif = fl4->flowi4_iif;
> +	xdst->u.rt.rt_iif = fl4->flowi4_iif;
> +	xdst->u.rt.rt_oif = fl4->flowi4_oif;
> +	xdst->u.rt.rt_mark = fl4->flowi4_mark;
>  
>  	xdst->u.dst.dev = dev;
>  	dev_hold(dev);

Good catch, thanks !
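
A toy model of the difference, for readers skimming the diff. All toy_*
names and both fill helpers are hypothetical; the only thing taken from the
patch is that the old code wrote through the xdst->route pointer while the
new code writes to the rtable embedded in the xfrm_dst.

```c
/* Toy stand-in for the rtable key fields touched by the patch. */
struct toy_rtable {
	unsigned int rt_key_dst;
	unsigned int rt_oif;
};

struct toy_xfrm_dst {
	struct toy_rtable rt;     /* stands in for xdst->u.rt (own copy) */
	struct toy_rtable *route; /* stands in for xdst->route (shared)  */
};

/* Old behaviour: the keys of the shared route get overwritten. */
static void fill_dst_buggy(struct toy_xfrm_dst *xdst,
			   unsigned int daddr, unsigned int oif)
{
	struct toy_rtable *rt = xdst->route;

	rt->rt_key_dst = daddr;
	rt->rt_oif = oif;
}

/* Patched behaviour: only the embedded copy is written. */
static void fill_dst_fixed(struct toy_xfrm_dst *xdst,
			   unsigned int daddr, unsigned int oif)
{
	xdst->rt.rt_key_dst = daddr;
	xdst->rt.rt_oif = oif;
}
```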

Reported-by: Kim Phillips <kim.phillips@freescale.com>

Acked-by: Eric Dumazet <eric.dumazet@gmail.com>

^ permalink raw reply

* Re: [patch net-next V2] net: introduce ethernet teaming device
From: Jiri Pirko @ 2011-10-23  8:52 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: netdev, davem, bhutchings, shemminger, fubar, andy, tgraf,
	ebiederm, mirqus, kaber, greearb, jesse, fbl, benjamin.poirier,
	jzupka
In-Reply-To: <1319359437.6180.73.camel@edumazet-laptop>

Sun, Oct 23, 2011 at 10:43:57AM CEST, eric.dumazet@gmail.com wrote:
>On Sunday, 23 October 2011 at 10:25 +0200, Jiri Pirko wrote:
>> Sat, Oct 22, 2011 at 06:51:22PM CEST, eric.dumazet@gmail.com wrote:
>> >On Saturday, 22 October 2011 at 17:13 +0200, Jiri Pirko wrote:
>> >> >> +
>> >> >> +/************************
>> >> >> + * Rx path frame handler
>> >> >> + ************************/
>> >> >> +
>> >> >> +/* note: already called with rcu_read_lock */
>> >> >> +static rx_handler_result_t team_handle_frame(struct sk_buff **pskb)
>> >> >> +{
>> >> >> +	struct sk_buff *skb = *pskb;
>> >> >> +	struct team_port *port;
>> >> >> +	struct team *team;
>> >> >> +	rx_handler_result_t res = RX_HANDLER_ANOTHER;
>> >> >> +
>> >> >> +	skb = skb_share_check(skb, GFP_ATOMIC);
>> >> >> +	if (!skb)
>> >> >> +		return RX_HANDLER_CONSUMED;
>> >> >> +
>> >> >> +	*pskb = skb;
>> >> >> +
>> >> >> +	port = team_port_get_rcu(skb->dev);
>> >> >> +	team = port->team;
>> >> >> +
>> >> >> +	if (team->mode_ops.receive)
>> >> >
>> >> >Hmm, you need ACCESS_ONCE() here or rcu_dereference()
>> >> >
>> >> >See commit 4d97480b1806e883eb (bonding: use local function pointer of
>> >> >bond->recv_probe in bond_handle_frame) for reference
>> >> 
>> >> I do not think so. Because mode_ops.receive changes only from
>> >> __team_change_mode() and this can be called only in case no ports are in
>> >> team. And team_port_del() calls synchronize_rcu().
>> >> 
>> >
>> >
>> >
>> >We are used to code following this template:
>> >
>> >if (ops->handler)
>> >	ops->handler(arguments);
>> >
>> >But this is valid only because ops points to constant memory.
>> >
>> >
>> >In your case, we can clearly see it's not true; don't try to pretend it's safe.
>> 
>> Please forgive me, it's possible I'm missing something. But I see no way
>> team->mode_ops.receive can change during team_handle_frame() (which runs
>> under rcu_read_lock) for the reason I stated earlier.
>> 
>> team_port_del() calls netdev_rx_handler_unregister() and after that it
>> calls synchronize_rcu(). That ensures that by the time team_port_del()
>> returns, team_handle_frame() is no longer called for this port.
>> 
>> And this combined with "if (!list_empty(&team->port_list))" check in
>> team_change_mode() ensures safety.
>> 
>> Of course team_port_del() and team_change_mode() are both protected by
>> team->lock so they are mutually exclusive.
>
>Then why even test whether team->mode_ops.receive is NULL in the first
>place, if you are sure no packet in flight can meet this NULL pointer?
>
>Something is flawed in the logic.

It's not :) The test is there simply because a mode may not implement this
callback (the "roundrobin" mode, for one, doesn't implement it).
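
To make the two positions concrete, here is a rough userspace sketch, not
the actual team driver code. All names (team_sketch, sketch_change_mode,
sketch_handle_frame, ACCESS_ONCE_SKETCH) are made up for illustration, and
nr_ports merely stands in for the port_list check. It shows the mode change
being refused while ports are attached, and the rx path loading the
callback once into a local before testing and calling it, which is the
pattern Eric pointed to in the bonding commit:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's ACCESS_ONCE(): forces a single load.
 * Relies on the GCC/Clang __typeof__ extension. */
#define ACCESS_ONCE_SKETCH(x) (*(volatile __typeof__(x) *)&(x))

typedef int (*receive_fn)(int frame);

/* Hypothetical, heavily reduced stand-in for struct team. */
struct team_sketch {
	receive_fn receive;	/* NULL when the mode has no rx callback */
	int nr_ports;		/* stands in for !list_empty(&team->port_list) */
};

/* Example callback: pretend the mode rewrites the frame. */
static int sketch_receive(int frame)
{
	return frame + 1;
}

/* Mode changes are refused while any port is attached. */
static int sketch_change_mode(struct team_sketch *team, receive_fn fn)
{
	assert(team != NULL);
	if (team->nr_ports > 0)
		return -EBUSY;
	team->receive = fn;
	return 0;
}

/* Rx path: load the pointer once, then test and call the local copy. */
static int sketch_handle_frame(struct team_sketch *team, int frame)
{
	receive_fn recv;

	assert(team != NULL);
	recv = ACCESS_ONCE_SKETCH(team->receive);
	if (recv)
		return recv(frame);
	return frame;	/* no callback: pass the frame through unchanged */
}
```

With the local copy, the NULL test and the call always see the same value,
so the rx path no longer depends on the "no ports while changing mode"
argument holding. The NULL test itself stays, since a mode may simply leave
the callback unset.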

Jirka
>
>
>

^ permalink raw reply

