Netdev List
 help / color / mirror / Atom feed
* GoodDay
From: Mr Jack Parkinson @ 2011-11-13  2:08 UTC (permalink / raw)


GoodDay

My Name is Mr Jack Parkinson, I work at the Euro lottery company and i want
to seek for your opinion on making you win our ongoing lottery draw and if
i do, We both will be sharing the winning funds 50-50%. This is for real
and if interested contact me privately via my email: mr.jackp002@fengv.com

Mr Jack Parkinson.




----------------------------------------------------------------
This message was sent using IMP, the Internet Messaging Program.

^ permalink raw reply

* GoodDay
From: Mr Jack Parkinson @ 2011-11-13  1:08 UTC (permalink / raw)


GoodDay

My Name is Mr Jack Parkinson, I work at the Euro lottery company and i want
to seek for your opinion on making you win our ongoing lottery draw and if
i do, We both will be sharing the winning funds 50-50%. This is for real
and if interested contact me privately via my email: mr.jackp002@fengv.com

Mr Jack Parkinson.




----------------------------------------------------------------
This message was sent using IMP, the Internet Messaging Program.

^ permalink raw reply

* Re: send/receive data via eth0/1 external loop ?
From: Roland Kletzing @ 2011-11-12 23:15 UTC (permalink / raw)
  To: Julian Anastasov; +Cc: netdev

to confirm: IT WORKS! (tested with 2.6.38 kernel)  :)

more detail:
>> You can use accept_local flag and rule with pref 0:
>>
>>http://marc.info/?l=linux-netdev&m=125983955627148&w=2

this in itself is not enought to get it running, but with this one
http://marc.info/?l=linux-netdev&m=129500124207256&w=2
(adapted for eth0/1), it looks good  and i see packets flow between eth0/1

thanks again for the pointer !

regards
Roland


-----Ursprüngliche Nachricht-----
Von: "Roland Kletzing" <devzero@web.de>
Gesendet: 12.11.2011 19:50:33
An: "Julian Anastasov" <ja@ssi.bg>
Betreff: Re: send/receive data via eth0/1 external loop ?

>great, thanks!
>
>
>-----Ursprüngliche Nachricht-----
>Von: "Julian Anastasov" <ja@ssi.bg>
>Gesendet: 12.11.2011 19:31:13
>An: "Roland Kletzing" <devzero@web.de>
>Betreff: Re: send/receive data via eth0/1 external loop ?
>
>>
>> Hello,
>>
>>On Sat, 12 Nov 2011, Roland Kletzing wrote:
>>
>>> Hello,
>>> i`m searching for a while now, but i did not find a solution, so forgive when
>>> i`m (as a user) asking for expert help here.
>>>
>>> I want to do some network stress/burn-in and quality/reliability testing,
>>> i.e. i want to send/receive large amounts of data trough the systems network
>>> interfaces without the need of hogging the net or other systems.
>>>
>>> so, that test should run "loopback", i.e. connecting eth0+1 via crossover cable
>>> or via locally attached switch. I don`t want to use a second system for that.
>>>
>>> testing at ip/tcp level would be favourable , but raw eth level would be also fine.
>>>
>>> But how ?
>>> What appears simple on a first view, seems to be difficult.... :(
>>>
>>> All i came across is this report:
>>> http://www.zyztematik.com/?p=10
>>> and this description of a patch:
>>> http://www.ssi.bg/~ja/send-to-self.txt
>>>
>>> Does somebody know a proper way of achieving this without patching the
>>> kernel ?
>>
>> You can use accept_local flag and rule with pref 0:
>>
>>http://marc.info/?l=linux-netdev&m=125983955627148&w=2
>>
>>Regards
>>
>>--
>>Julian Anastasov <ja@ssi.bg>
>


___________________________________________________________
SMS schreiben mit WEB.DE FreeMail - einfach, schnell und
kostenguenstig. Jetzt gleich testen! http://f.web.de/?mc=021192

^ permalink raw reply

* Re: [PATCH] ah: Don't return NET_XMIT_DROP on input.
From: David Miller @ 2011-11-12 23:14 UTC (permalink / raw)
  To: nbowler; +Cc: netdev, linux-kernel
In-Reply-To: <1320951687-21491-1-git-send-email-nbowler@elliptictech.com>

From: Nick Bowler <nbowler@elliptictech.com>
Date: Thu, 10 Nov 2011 14:01:27 -0500

> When the ahash driver returns -EBUSY, AH4/6 input functions return
> NET_XMIT_DROP, presumably copied from the output code path.  But
> returning transmit codes on input doesn't make a lot of sense.
> Since NET_XMIT_DROP is a positive int, this gets interpreted as
> the next header type (i.e., success).  As that can only end badly,
> remove the check.
> 
> Signed-off-by: Nick Bowler <nbowler@elliptictech.com>

Applied and queued up for -stable, thanks Nick.

^ permalink raw reply

* Re: [PATCH net-next 0/4] be2net fixes
From: David Miller @ 2011-11-12 22:59 UTC (permalink / raw)
  To: sathya.perla; +Cc: netdev
In-Reply-To: <1320988680-7842-1-git-send-email-sathya.perla@emulex.com>

From: Sathya Perla <sathya.perla@emulex.com>
Date: Fri, 11 Nov 2011 10:47:56 +0530

> Pls apply. Thanks.
> 
> Sathya Perla (4):
>   be2net: init (vf)_if_handle/vf_pmac_id to handle failure scenarios
>   be2net: stop checking the UE registers after an EEH error
>   be2net: don't log more than one error on detecting EEH/UE errors
>   be2net: stop issuing FW cmds if any cmd times out

All applied, thank you.

^ permalink raw reply

* Re: [PATCH 1/2] net: vlan: 802.1ad S-VLAN support
From: David Miller @ 2011-11-12 22:22 UTC (permalink / raw)
  To: equinox; +Cc: netdev, kaber, mirqus
In-Reply-To: <20111112141429.GA41984@jupiter.n2.diac24.net>

From: David Lamparter <equinox@diac24.net>
Date: Sat, 12 Nov 2011 15:14:29 +0100

> Well, hmm... Suggestions?

The feature is important to you, therefore you're the one who needs
to figure out how to avoid the memory bloat.

And no, config options aren't how to do this, every distribution
(and therefore %99 of users) will have it enabled.

^ permalink raw reply

* Re: send/receive data via eth0/1 external loop ?
From: Roland Kletzing @ 2011-11-12 18:50 UTC (permalink / raw)
  To: Julian Anastasov; +Cc: netdev

great, thanks!


-----Ursprüngliche Nachricht-----
Von: "Julian Anastasov" <ja@ssi.bg>
Gesendet: 12.11.2011 19:31:13
An: "Roland Kletzing" <devzero@web.de>
Betreff: Re: send/receive data via eth0/1 external loop ?

>
> Hello,
>
>On Sat, 12 Nov 2011, Roland Kletzing wrote:
>
>> Hello,
>> i`m searching for a while now, but i did not find a solution, so forgive when
>> i`m (as a user) asking for expert help here.
>>
>> I want to do some network stress/burn-in and quality/reliability testing,
>> i.e. i want to send/receive large amounts of data trough the systems network
>> interfaces without the need of hogging the net or other systems.
>>
>> so, that test should run "loopback", i.e. connecting eth0+1 via crossover cable
>> or via locally attached switch. I don`t want to use a second system for that.
>>
>> testing at ip/tcp level would be favourable , but raw eth level would be also fine.
>>
>> But how ?
>> What appears simple on a first view, seems to be difficult.... :(
>>
>> All i came across is this report:
>> http://www.zyztematik.com/?p=10
>> and this description of a patch:
>> http://www.ssi.bg/~ja/send-to-self.txt
>>
>> Does somebody know a proper way of achieving this without patching the
>> kernel ?
>
> You can use accept_local flag and rule with pref 0:
>
>http://marc.info/?l=linux-netdev&m=125983955627148&w=2
>
>Regards
>
>--
>Julian Anastasov <ja@ssi.bg>


___________________________________________________________
SMS schreiben mit WEB.DE FreeMail - einfach, schnell und
kostenguenstig. Jetzt gleich testen! http://f.web.de/?mc=021192

^ permalink raw reply

* Re: send/receive data via eth0/1 external loop ?
From: Julian Anastasov @ 2011-11-12 18:31 UTC (permalink / raw)
  To: Roland Kletzing; +Cc: netdev
In-Reply-To: <104297899.1282710.1321058833510.JavaMail.fmail@mwmweb017>


	Hello,

On Sat, 12 Nov 2011, Roland Kletzing wrote:

> Hello,
> i`m searching for a while now, but i did not find a solution, so forgive when
> i`m (as a user) asking for expert help here.
> 
> I want to do some network stress/burn-in and quality/reliability testing,
> i.e. i want to send/receive large amounts of data trough the systems network
> interfaces without the need of hogging the net or other systems.
> 
> so, that test should run "loopback", i.e. connecting eth0+1 via crossover cable
> or via locally attached switch. I don`t want to use a second system for that.
> 
> testing at ip/tcp level would be favourable , but raw eth level would be also fine.
> 
> But how ?
> What appears simple on a first view, seems to be difficult.... :(
> 
> All i came across is this report:
> http://www.zyztematik.com/?p=10
> and this description of a patch:
> http://www.ssi.bg/~ja/send-to-self.txt
> 
> Does somebody know a proper way of achieving this without patching the
> kernel ?

	You can use accept_local flag and rule with pref 0:

http://marc.info/?l=linux-netdev&m=125983955627148&w=2

Regards

--
Julian Anastasov <ja@ssi.bg>

^ permalink raw reply

* Re: [PATCH 1/2] net: vlan: 802.1ad S-VLAN support
From: Michał Mirosław @ 2011-11-12 16:06 UTC (permalink / raw)
  To: David Lamparter; +Cc: David Miller, netdev, kaber
In-Reply-To: <20111112141429.GA41984@jupiter.n2.diac24.net>

2011/11/12 David Lamparter <equinox@diac24.net>:
> On Fri, Nov 11, 2011 at 08:22:35PM -0500, David Miller wrote:
>> > @@ -87,7 +97,8 @@ struct vlan_group {
>> >                                         */
>> >     unsigned int            nr_vlans;
>> >     struct hlist_node       hlist;  /* linked list */
>> > -   struct net_device **vlan_devices_arrays[VLAN_GROUP_ARRAY_SPLIT_PARTS];
>> > +   struct net_device **vlan_devices_arrays[VLAN_N_PROTOCOL]
>> > +                                           [VLAN_GROUP_ARRAY_SPLIT_PARTS];
>> >     struct rcu_head         rcu;
>> >  };
>>
>> This is a terrible waste of memory.  You're now using 5 times as much space,
>> the vast majority of which will be entirely unused.
>
> VLAN_GROUP_ARRAY_SPLIT_PARTS is 8; so the memory consumption of this was
> previously 8 * ptr = 64 bytes and is now 5 * 8 * ptr = 320 bytes. I thought
> those extra 256 bytes per VLAN-carrying master device are worth the
> simplicity, especially since this saves me impacts on 802.1Q C-VLAN
> lookup performance elsewhere.

I misread this as a table of all VLANs. It doesn't look that terrible
now compared to old scheme. In the case when you have only a handful
of VLANs configured you waste a 4kB per VLAN (but that's how it is
done already).

Best Regards,
Michał Mirosław

^ permalink raw reply

* Re: [PATCH 1/2] net: vlan: 802.1ad S-VLAN support
From: David Lamparter @ 2011-11-12 14:14 UTC (permalink / raw)
  To: David Miller; +Cc: equinox, netdev, kaber, Michał Mirosław
In-Reply-To: <20111111.202235.1981500713669985784.davem@davemloft.net>

On Fri, Nov 11, 2011 at 08:22:35PM -0500, David Miller wrote:
> > @@ -87,7 +97,8 @@ struct vlan_group {
> >  					    */
> >  	unsigned int		nr_vlans;
> >  	struct hlist_node	hlist;	/* linked list */
> > -	struct net_device **vlan_devices_arrays[VLAN_GROUP_ARRAY_SPLIT_PARTS];
> > +	struct net_device **vlan_devices_arrays[VLAN_N_PROTOCOL]
> > +						[VLAN_GROUP_ARRAY_SPLIT_PARTS];
> >  	struct rcu_head		rcu;
> >  };
> 
> This is a terrible waste of memory.  You're now using 5 times as much space,
> the vast majority of which will be entirely unused.

VLAN_GROUP_ARRAY_SPLIT_PARTS is 8; so the memory consumption of this was
previously 8 * ptr = 64 bytes and is now 5 * 8 * ptr = 320 bytes. I thought
those extra 256 bytes per VLAN-carrying master device are worth the
simplicity, especially since this saves me impacts on 802.1Q C-VLAN
lookup performance elsewhere.

The individual VLAN_GROUP_ARRAY_SPLIT_PARTS are allocated on-demand in
vlan_group_prealloc_vid (net/8021q/vlan.c). They aren't freed if they
get empty, only when all VLANs disappear; that's an issue with the
existing code that could be fixed independently.

> I don't even think it's semantically correct, all these alias QinQ protocol
> values don't provide completely new VLAN_ID name spaces at all.  So this
> layout doesn't even make any sense, you're allowing for something that isn't
> even allowed.

The namespaces are separate. 802.1ad goes to quite some lengths to
replace all occurences of "VLAN ID" with "C-VLAN ID" and introduces the
separate "S-VLAN ID".

Nortel's 0x9?00 protocol values have no spec that i know of... oh, I
could save 192 bytes with an "[ ] Legacy Nortel protocol IDs" switch,
I guess.

> Rework these datastructures to eliminate the wastage please.

I could reverse the order of the id vs. protocol lookups, i.e. grab the
VLAN ID from the table as it was previously and then go hunt for the
device with the correct protocol. That'd require chain-linking or
tabling the devices and make 802.1Q normal/C-VLAN device lookups
slower, which I'd very much like to avoid.

The alternative would be a completely different structure for non-0x8100
VLANs, but that's an entirely different level of complexity there, which
I'm also trying to avoid...


Well, hmm... Suggestions?


-equi

^ permalink raw reply

* Re: [PATCH 1/1] net: fec: convert to clk_prepare/clk_unprepare
From: Uwe Kleine-König @ 2011-11-12 12:49 UTC (permalink / raw)
  To: Richard Zhao
  Cc: netdev, davem, linux-arm-kernel, kernel, shawn.guo, eric.miao
In-Reply-To: <1320901304-13157-1-git-send-email-richard.zhao@linaro.org>

Hello Richard,

On Thu, Nov 10, 2011 at 01:01:44PM +0800, Richard Zhao wrote:
> Signed-off-by: Richard Zhao <richard.zhao@linaro.org>
> ---
>  drivers/net/ethernet/freescale/fec.c |    5 +++++
>  1 files changed, 5 insertions(+), 0 deletions(-)
> 
> diff --git a/drivers/net/ethernet/freescale/fec.c b/drivers/net/ethernet/freescale/fec.c
> index 1124ce0..92e3585 100644
> --- a/drivers/net/ethernet/freescale/fec.c
> +++ b/drivers/net/ethernet/freescale/fec.c
> @@ -1588,6 +1588,7 @@ fec_probe(struct platform_device *pdev)
>  		ret = PTR_ERR(fep->clk);
>  		goto failed_clk;
>  	}
> +	clk_prepare(fep->clk);
What about error checking?

Best regards
Uwe

-- 
Pengutronix e.K.                           | Uwe Kleine-König            |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |

^ permalink raw reply

* Re: [PATCH] r8169: add module param for control of ASPM  disable
From: Francois Romieu @ 2011-11-12 10:46 UTC (permalink / raw)
  To: Todd Broch
  Cc: Matthew Garrett, Realtek linux nic maintainers, netdev,
	Hayes Wang
In-Reply-To: <20111112045935.GA24292@srcf.ucam.org>

Matthew Garrett <mjg59@srcf.ucam.org> :
> On Fri, Nov 11, 2011 at 03:05:55PM -0800, Todd Broch wrote:
> > ASPM has been disabled in this driver by default as its been
> > implicated in stability issues on at least one platform.  This CL adds
> > a  module parameter to allow control of ASPM disable.  The default
> > value is to enable ASPM again as its provides signficant (200mW) power
> > savings on the platform I tested.
> 
> The default is then to break some hardware. That's not reasonable. 
> Instead, we need to identify which chip versions are broken and limit 
> the blacklist to them.

Yes, assuming it's only a device problem.

Todd, which chipset(s) (XID or such) was it tested with ?

It would be nice to know the kind of use as well (desktop / laptop / server).

Btw, what is a 'CL' ?

-- 
Ueimor

^ permalink raw reply

* Re: creating netdev queues on the fly?
From: Eric Dumazet @ 2011-11-12  9:45 UTC (permalink / raw)
  To: Dave Taht
  Cc: Denys Fedoryshchenko, Helmut Schaa, Johannes Berg, netdev,
	linux-wireless
In-Reply-To: <1321009374.2548.31.camel@edumazet-laptop>

Le vendredi 11 novembre 2011 à 12:02 +0100, Eric Dumazet a écrit :

> I would see a new Qdisc/Class property, like the rate estimator, that we
> can attach to any Qdisc/Class with a new tc option.
> 
> Even without any limit enforcing (might be Random Early Detection by the
> way), it could be used to get a Queue Delay estimation, using EWMA
> 
> avqdelay = avqdelay*(1-W) + qdelay*W;
> W = 2^(-ewma_log);
> 
> tc [ qdisc | class] add [...] [est 1sec 8sec] [delayest ewma_log ] ..
> 
> tc -s -d qdisc ...
> qdisc htb 1: root refcnt 2 r2q 10 default 1 direct_packets_stat 0 ver 3.17
>  Sent 3596219 bytes 2567 pkt (dropped 238, overlimits 3797 requeues 0) 
>  rate 2557Kbit 215pps backlog 0b 0p requeues 0 
>  delay 91ms
> 
> 


I coded the thing (delayest at qdisc level) and got interesting values,
for example with following HTB setup :

DEV=eth3
MTU=1500
rate=10mbit
EST="est 1sec 8sec delayest 6"

tc qdisc del dev $DEV root
tc qdisc add dev $DEV root handle 1: ${EST} \
	htb default 1 
tc class add dev $DEV parent 1: classid 1:1 htb \
	rate ${rate} mtu 40000 quantum 80000
tc qdisc add dev $DEV parent 1:1 handle 10: ${EST} pfifo limit 3


With light trafic on my x86_64 machine I have :

# tcnew -s -d qdisc show dev eth3
qdisc htb 1: root refcnt 17 r2q 10 default 1 direct_packets_stat 0 ver 3.17
 Sent 78216 bytes 569 pkt (dropped 0, overlimits 0 requeues 0) 
 rate 9040bit 13pps backlog 0b 0p requeues 0 
 delay 1126 ns log 6
qdisc pfifo 10: parent 1:1 limit 3p
 Sent 78216 bytes 569 pkt (dropped 0, overlimits 0 requeues 0) 
 rate 9040bit 13pps backlog 0b 0p requeues 0 
 delay 731 ns log 6

Wow, 1126ns of overhead per packet...

This is the prototype kernel patch I used on top of net-next :

diff --git a/include/linux/gen_stats.h b/include/linux/gen_stats.h
index 552c8a0..5ad57a6 100644
--- a/include/linux/gen_stats.h
+++ b/include/linux/gen_stats.h
@@ -63,5 +63,16 @@ struct gnet_estimator {
 	unsigned char	ewma_log;
 };
 
+/**
+ * struct gnet_qdelay - queue delay configuration / reports
+ * @avdelay: average queue delay in ns
+ * @limit: packets delayed more than this value are dropped
+ * @avdelaylog: the log of measurement window weight
+ */
+struct gnet_qdelay {
+	__u64 avdelay;
+	__u64 limit;
+	__u32 avdelaylog;
+};
 
 #endif /* __LINUX_GEN_STATS_H */
diff --git a/include/linux/rtnetlink.h b/include/linux/rtnetlink.h
index 8e872ea..61c66ec 100644
--- a/include/linux/rtnetlink.h
+++ b/include/linux/rtnetlink.h
@@ -484,6 +484,7 @@ enum {
 	TCA_FCNT,
 	TCA_STATS2,
 	TCA_STAB,
+	TCA_QDELAY,
 	__TCA_MAX
 };
 
diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index f6bb08b..e293228 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -42,6 +42,13 @@ struct qdisc_size_table {
 	u16			data[];
 };
 
+/*
+ * qdisc/class avdelay is computed using EWMA, with a fixed factor of 16
+ * Only the weight is a parameter (avdelaylog)
+ * With u64 values, this leaves 48 bits, a max of 281474 seconds.
+ */
+#define TCQ_AVDELAY_FACTOR 16
+
 struct Qdisc {
 	int 			(*enqueue)(struct sk_buff *skb, struct Qdisc *dev);
 	struct sk_buff *	(*dequeue)(struct Qdisc *dev);
@@ -50,8 +57,11 @@ struct Qdisc {
 #define TCQ_F_INGRESS		2
 #define TCQ_F_CAN_BYPASS	4
 #define TCQ_F_MQROOT		8
+#define TCQ_F_QDELAY		0x10
 #define TCQ_F_WARN_NONWC	(1 << 16)
-	int			padded;
+	u8			padded;
+	u8			avdelaylog; /* avdelay EWMA weight */
+	u8			_pad[2];
 	const struct Qdisc_ops	*ops;
 	struct qdisc_size_table	__rcu *stab;
 	struct list_head	list;
@@ -80,6 +90,10 @@ struct Qdisc {
 	struct gnet_stats_basic_packed bstats;
 	unsigned int		__state;
 	struct gnet_stats_queue	qstats;
+
+	/* average queue delay in ns << TCQ_AVDELAY_FACTOR */
+	u64			avdelay;
+
 	struct rcu_head		rcu_head;
 	spinlock_t		busylock;
 	u32			limit;
@@ -219,6 +233,9 @@ struct tcf_proto {
 };
 
 struct qdisc_skb_cb {
+#ifdef CONFIG_NET_SCHED_QDELAY
+	ktime_t			enqueue_time;
+#endif
 	unsigned int		pkt_len;
 	long			data[];
 };
@@ -467,6 +484,14 @@ static inline void qdisc_bstats_update(struct Qdisc *sch,
 				       const struct sk_buff *skb)
 {
 	bstats_update(&sch->bstats, skb);
+#ifdef CONFIG_NET_SCHED_QDELAY
+	if (sch->flags & TCQ_F_QDELAY) {
+		u64 delay = ktime_to_ns(ktime_sub(ktime_get(),
+					qdisc_skb_cb(skb)->enqueue_time));
+		delay <<= TCQ_AVDELAY_FACTOR;
+		sch->avdelay += (delay >> sch->avdelaylog) - (sch->avdelay >> sch->avdelaylog);
+	}
+#endif
 }
 
 static inline int __qdisc_enqueue_tail(struct sk_buff *skb, struct Qdisc *sch,
diff --git a/net/core/dev.c b/net/core/dev.c
index 6ba50a1..587534d 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2402,6 +2402,13 @@ static inline int __dev_xmit_skb(struct sk_buff *skb, struct Qdisc *q,
 	int rc;
 
 	qdisc_skb_cb(skb)->pkt_len = skb->len;
+
+#ifdef CONFIG_NET_SCHED_QDELAY
+	qdisc_skb_cb(skb)->enqueue_time.tv64 = 0;
+	if (q->flags & TCQ_F_QDELAY)
+		qdisc_skb_cb(skb)->enqueue_time = ktime_get();
+#endif
+
 	qdisc_calculate_pkt_len(skb, q);
 	/*
 	 * Heuristic to force contended enqueues to serialize on a
diff --git a/net/sched/Kconfig b/net/sched/Kconfig
index 2590e91..028f882 100644
--- a/net/sched/Kconfig
+++ b/net/sched/Kconfig
@@ -470,6 +470,17 @@ config NET_CLS_ACT
 	  A recent version of the iproute2 package is required to use
 	  extended matches.
 
+config NET_SCHED_QDELAY
+	bool "QDISC/CLASS queue delay Estimators and Limits"
+	---help---
+	  Say Y here if you want to be able to track queue delays, and
+	  be able to drop packets if they stay in a queue a too long time.
+	  It adds some overhead per packet, since it needs to get precise
+	  time at enqueue and dequeue time.
+
+	  A recent version of the iproute2 package is required to use
+	  extended matches.
+
 config NET_ACT_POLICE
 	tristate "Traffic Policing"
         depends on NET_CLS_ACT 
diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c
index dca6c1a..212fba9 100644
--- a/net/sched/sch_api.c
+++ b/net/sched/sch_api.c
@@ -842,6 +842,17 @@ qdisc_create(struct net_device *dev, struct netdev_queue *dev_queue,
 			}
 			rcu_assign_pointer(sch->stab, stab);
 		}
+		if (tca[TCA_QDELAY]) {
+			struct gnet_qdelay *parm = nla_data(tca[TCA_QDELAY]);
+			err = -EINVAL;
+			if (nla_len(tca[TCA_QDELAY]) < sizeof(*parm))
+				goto err_out4;
+			sch->avdelaylog = parm->avdelaylog;
+			sch->flags |= TCQ_F_QDELAY;
+		} else { /* temporary testing */
+			sch->avdelaylog = 6;
+			sch->flags |= TCQ_F_QDELAY;
+		}
 		if (tca[TCA_RATE]) {
 			spinlock_t *root_lock;
 
@@ -1206,6 +1217,16 @@ static int tc_fill_qdisc(struct sk_buff *skb, struct Qdisc *q, u32 clid,
 	if (stab && qdisc_dump_stab(skb, stab) < 0)
 		goto nla_put_failure;
 
+#ifdef CONFIG_NET_SCHED_QDELAY
+	if (q->flags & TCQ_F_QDELAY) {
+		struct gnet_qdelay qdelay;
+
+		memset(&qdelay, 0, sizeof(qdelay));	
+		qdelay.avdelay = q->avdelay >> TCQ_AVDELAY_FACTOR;
+		qdelay.avdelaylog = q->avdelaylog;
+		NLA_PUT(skb, TCA_QDELAY, sizeof(qdelay), &qdelay);
+	}
+#endif
 	if (gnet_stats_start_copy_compat(skb, TCA_STATS2, TCA_STATS, TCA_XSTATS,
 					 qdisc_root_sleeping_lock(q), &d) < 0)
 		goto nla_put_failure;

^ permalink raw reply related

* Re: [PATCH 1/2] net: vlan: 802.1ad S-VLAN support
From: Michał Mirosław @ 2011-11-12  9:25 UTC (permalink / raw)
  To: David Miller; +Cc: equinox, netdev, kaber
In-Reply-To: <20111111.202235.1981500713669985784.davem@davemloft.net>

2011/11/12 David Miller <davem@davemloft.net>:
> From: David Lamparter <equinox@diac24.net>
> Date: Sat,  5 Nov 2011 17:54:14 +0100
>
>> @@ -87,7 +97,8 @@ struct vlan_group {
>>                                           */
>>       unsigned int            nr_vlans;
>>       struct hlist_node       hlist;  /* linked list */
>> -     struct net_device **vlan_devices_arrays[VLAN_GROUP_ARRAY_SPLIT_PARTS];
>> +     struct net_device **vlan_devices_arrays[VLAN_N_PROTOCOL]
>> +                                             [VLAN_GROUP_ARRAY_SPLIT_PARTS];
>>       struct rcu_head         rcu;
>>  };
> This is a terrible waste of memory.  You're now using 5 times as much space,
> the vast majority of which will be entirely unused.
>
> I don't even think it's semantically correct, all these alias QinQ protocol
> values don't provide completely new VLAN_ID name spaces at all.  So this
> layout doesn't even make any sense, you're allowing for something that isn't
> even allowed.

While I agree about the memory usage argument, I also like the
separation idea as is. If you want VLANs encapsulated in other
protocols to alias, then you can still do it by attaching them to
properly configured bond or bridge.

Best Regards,
Michał Mirosław

^ permalink raw reply

* Dear Winner
From: Christopher Knight Morrow @ 2011-11-12  7:48 UTC (permalink / raw)





Dear Winner,
I am please to announce to you that your email has just won the sum of
500,000.00 GBP from MICROSOFT LOTTERY SWEEPSTAKES DRAW.Your email address as drawn and attached ID number: BLF/9080648302/11. contact Jim Frank who shall by duty guide you through the process to facilitate the release of your winnings to you 
Mr Jim Frank
E-mail: jimfrank39@gmail.com

Online Director
Christopher Knight

^ permalink raw reply

* Re: [patch net-next V7] net: introduce ethernet teaming device
From: Jiri Pirko @ 2011-11-12  8:18 UTC (permalink / raw)
  To: David Miller
  Cc: fbl, netdev, eric.dumazet, bhutchings, shemminger, fubar, andy,
	tgraf, ebiederm, mirqus, kaber, greearb, jesse, benjamin.poirier,
	jzupka, ivecera
In-Reply-To: <20111112.004555.1043819970109229358.davem@davemloft.net>

Sat, Nov 12, 2011 at 06:45:55AM CET, davem@davemloft.net wrote:
>From: Jiri Pirko <jpirko@redhat.com>
>Date: Sat, 12 Nov 2011 01:15:00 +0100
>
>> Yes, this should be checked. I missed that. I would like to address this
>> as follow up patch because I'm sure it's not the last bug in team (I
>> would be sending Vx of the patch for long time :( )
>
>No, Jiri.  Known bugs with obvious fixes get fixed first.
>
>You see it as an endless process, but rather the code just isn't ready
>yet if people can still find easy to diagnose and fix flaws so
>quickly.

Okay, v8 posted.

thanks Dave.

^ permalink raw reply

* [patch net-next V8] net: introduce ethernet teaming device
From: Jiri Pirko @ 2011-11-12  8:16 UTC (permalink / raw)
  To: netdev
  Cc: davem, eric.dumazet, bhutchings, shemminger, fubar, andy, tgraf,
	ebiederm, mirqus, kaber, greearb, jesse, fbl, benjamin.poirier,
	jzupka, ivecera

This patch introduces new network device called team. It supposes to be
very fast, simple, userspace-driven alternative to existing bonding
driver.

Userspace library called libteam with couple of demo apps is available
here:
https://github.com/jpirko/libteam
Note it's still in its dipers atm.

team<->libteam use generic netlink for communication. That and rtnl
suppose to be the only way to configure team device, no sysfs etc.

Python binding of libteam was recently introduced.
Daemon providing arpmon/miimon active-backup functionality will be
introduced shortly. All what's necessary is already implemented in
kernel team driver.

Signed-off-by: Jiri Pirko <jpirko@redhat.com>

v7->v8:
	- check ndo_ndo_vlan_rx_[add/kill]_vid functions before calling
	  them.
	- use dev_kfree_skb_any() instead of dev_kfree_skb()

v6->v7:
	- transmit and receive functions are not checked in hot paths.
	  That also resolves memory leak on transmit when no port is
	  present

v5->v6:
	- changed couple of _rcu calls to non _rcu ones in non-readers

v4->v5:
	- team_change_mtu() uses team->lock while travesing though port
	  list
	- mac address changes are moved completely to jurisdiction of
	  userspace daemon. This way the daemon can do FOM1, FOM2 and
	  possibly other weird things with mac addresses.
	  Only round-robin mode sets up all ports to bond's address then
	  enslaved.
	- Extended Kconfig text

v3->v4:
	- remove redundant synchronize_rcu from __team_change_mode()
	- revert "set and clear of mode_ops happens per pointer, not per
	  byte"
	- extend comment of function __team_change_mode()

v2->v3:
	- team_change_mtu() uses rcu version of list traversal to unwind
	- set and clear of mode_ops happens per pointer, not per byte
	- port hashlist changed to be embedded into team structure
	- error branch in team_port_enter() does cleanup now
	- fixed rtln->rtnl

v1->v2:
	- modes are made as modules. Makes team more modular and
	  extendable.
	- several commenters' nitpicks found on v1 were fixed
	- several other bugs were fixed.
	- note I ignored Eric's comment about roundrobin port selector
	  as Eric's way may be easily implemented as another mode (mode
	  "random") in future.
---
 Documentation/networking/team.txt         |    2 +
 MAINTAINERS                               |    7 +
 drivers/net/Kconfig                       |    2 +
 drivers/net/Makefile                      |    1 +
 drivers/net/team/Kconfig                  |   43 +
 drivers/net/team/Makefile                 |    7 +
 drivers/net/team/team.c                   | 1583 +++++++++++++++++++++++++++++
 drivers/net/team/team_mode_activebackup.c |  137 +++
 drivers/net/team/team_mode_roundrobin.c   |  107 ++
 include/linux/Kbuild                      |    1 +
 include/linux/if.h                        |    1 +
 include/linux/if_team.h                   |  242 +++++
 12 files changed, 2133 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/networking/team.txt
 create mode 100644 drivers/net/team/Kconfig
 create mode 100644 drivers/net/team/Makefile
 create mode 100644 drivers/net/team/team.c
 create mode 100644 drivers/net/team/team_mode_activebackup.c
 create mode 100644 drivers/net/team/team_mode_roundrobin.c
 create mode 100644 include/linux/if_team.h

diff --git a/Documentation/networking/team.txt b/Documentation/networking/team.txt
new file mode 100644
index 0000000..5a01368
--- /dev/null
+++ b/Documentation/networking/team.txt
@@ -0,0 +1,2 @@
+Team devices are driven from userspace via libteam library which is here:
+	https://github.com/jpirko/libteam
diff --git a/MAINTAINERS b/MAINTAINERS
index 4808256..8d94169 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -6484,6 +6484,13 @@ W:	http://tcp-lp-mod.sourceforge.net/
 S:	Maintained
 F:	net/ipv4/tcp_lp.c
 
+TEAM DRIVER
+M:	Jiri Pirko <jpirko@redhat.com>
+L:	netdev@vger.kernel.org
+S:	Supported
+F:	drivers/net/team/
+F:	include/linux/if_team.h
+
 TEGRA SUPPORT
 M:	Colin Cross <ccross@android.com>
 M:	Olof Johansson <olof@lixom.net>
diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig
index 583f66c..b3020be 100644
--- a/drivers/net/Kconfig
+++ b/drivers/net/Kconfig
@@ -125,6 +125,8 @@ config IFB
 	  'ifb1' etc.
 	  Look at the iproute2 documentation directory for usage etc
 
+source "drivers/net/team/Kconfig"
+
 config MACVLAN
 	tristate "MAC-VLAN support (EXPERIMENTAL)"
 	depends on EXPERIMENTAL
diff --git a/drivers/net/Makefile b/drivers/net/Makefile
index fa877cd..4e4ebfe 100644
--- a/drivers/net/Makefile
+++ b/drivers/net/Makefile
@@ -17,6 +17,7 @@ obj-$(CONFIG_NET) += Space.o loopback.o
 obj-$(CONFIG_NETCONSOLE) += netconsole.o
 obj-$(CONFIG_PHYLIB) += phy/
 obj-$(CONFIG_RIONET) += rionet.o
+obj-$(CONFIG_NET_TEAM) += team/
 obj-$(CONFIG_TUN) += tun.o
 obj-$(CONFIG_VETH) += veth.o
 obj-$(CONFIG_VIRTIO_NET) += virtio_net.o
diff --git a/drivers/net/team/Kconfig b/drivers/net/team/Kconfig
new file mode 100644
index 0000000..248a144
--- /dev/null
+++ b/drivers/net/team/Kconfig
@@ -0,0 +1,43 @@
+menuconfig NET_TEAM
+	tristate "Ethernet team driver support (EXPERIMENTAL)"
+	depends on EXPERIMENTAL
+	---help---
+	  This allows one to create virtual interfaces that teams together
+	  multiple ethernet devices.
+
+	  Team devices can be added using the "ip" command from the
+	  iproute2 package:
+
+	  "ip link add link [ address MAC ] [ NAME ] type team"
+
+	  To compile this driver as a module, choose M here: the module
+	  will be called team.
+
+if NET_TEAM
+
+config NET_TEAM_MODE_ROUNDROBIN
+	tristate "Round-robin mode support"
+	depends on NET_TEAM
+	---help---
+	  Basic mode where port used for transmitting packets is selected in
+	  round-robin fashion using packet counter.
+
+	  All added ports are setup to have bond's mac address.
+
+	  To compile this team mode as a module, choose M here: the module
+	  will be called team_mode_roundrobin.
+
+config NET_TEAM_MODE_ACTIVEBACKUP
+	tristate "Active-backup mode support"
+	depends on NET_TEAM
+	---help---
+	  Only one port is active at a time and the rest of ports are used
+	  for backup.
+
+	  Mac addresses of ports are not modified. Userspace is responsible
+	  to do so.
+
+	  To compile this team mode as a module, choose M here: the module
+	  will be called team_mode_activebackup.
+
+endif # NET_TEAM
diff --git a/drivers/net/team/Makefile b/drivers/net/team/Makefile
new file mode 100644
index 0000000..85f2028
--- /dev/null
+++ b/drivers/net/team/Makefile
@@ -0,0 +1,7 @@
+#
+# Makefile for the network team driver
+#
+
+obj-$(CONFIG_NET_TEAM) += team.o
+obj-$(CONFIG_NET_TEAM_MODE_ROUNDROBIN) += team_mode_roundrobin.o
+obj-$(CONFIG_NET_TEAM_MODE_ACTIVEBACKUP) += team_mode_activebackup.o
diff --git a/drivers/net/team/team.c b/drivers/net/team/team.c
new file mode 100644
index 0000000..60672bb
--- /dev/null
+++ b/drivers/net/team/team.c
@@ -0,0 +1,1583 @@
+/*
+ * net/drivers/team/team.c - Network team device driver
+ * Copyright (c) 2011 Jiri Pirko <jpirko@redhat.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#include <linux/kernel.h>
+#include <linux/types.h>
+#include <linux/module.h>
+#include <linux/init.h>
+#include <linux/slab.h>
+#include <linux/rcupdate.h>
+#include <linux/errno.h>
+#include <linux/ctype.h>
+#include <linux/notifier.h>
+#include <linux/netdevice.h>
+#include <linux/if_arp.h>
+#include <linux/socket.h>
+#include <linux/etherdevice.h>
+#include <linux/rtnetlink.h>
+#include <net/rtnetlink.h>
+#include <net/genetlink.h>
+#include <net/netlink.h>
+#include <linux/if_team.h>
+
+#define DRV_NAME "team"
+
+
+/**********
+ * Helpers
+ **********/
+
+#define team_port_exists(dev) (dev->priv_flags & IFF_TEAM_PORT)
+
+static struct team_port *team_port_get_rcu(const struct net_device *dev)
+{
+	struct team_port *port = rcu_dereference(dev->rx_handler_data);
+
+	return team_port_exists(dev) ? port : NULL;
+}
+
+static struct team_port *team_port_get_rtnl(const struct net_device *dev)
+{
+	struct team_port *port = rtnl_dereference(dev->rx_handler_data);
+
+	return team_port_exists(dev) ? port : NULL;
+}
+
+/*
+ * Since the ability to change mac address for open port device is tested in
+ * team_port_add, this function can be called without control of return value
+ */
+static int __set_port_mac(struct net_device *port_dev,
+			  const unsigned char *dev_addr)
+{
+	struct sockaddr addr;
+
+	memcpy(addr.sa_data, dev_addr, ETH_ALEN);
+	addr.sa_family = ARPHRD_ETHER;
+	return dev_set_mac_address(port_dev, &addr);
+}
+
+int team_port_set_orig_mac(struct team_port *port)
+{
+	return __set_port_mac(port->dev, port->orig.dev_addr);
+}
+
+int team_port_set_team_mac(struct team_port *port)
+{
+	return __set_port_mac(port->dev, port->team->dev->dev_addr);
+}
+EXPORT_SYMBOL(team_port_set_team_mac);
+
+
+/*******************
+ * Options handling
+ *******************/
+
+void team_options_register(struct team *team, struct team_option *option,
+			   size_t option_count)
+{
+	int i;
+
+	for (i = 0; i < option_count; i++, option++)
+		list_add_tail(&option->list, &team->option_list);
+}
+EXPORT_SYMBOL(team_options_register);
+
+static void __team_options_change_check(struct team *team,
+					struct team_option *changed_option);
+
+static void __team_options_unregister(struct team *team,
+				      struct team_option *option,
+				      size_t option_count)
+{
+	int i;
+
+	for (i = 0; i < option_count; i++, option++)
+		list_del(&option->list);
+}
+
+void team_options_unregister(struct team *team, struct team_option *option,
+			     size_t option_count)
+{
+	__team_options_unregister(team, option, option_count);
+	__team_options_change_check(team, NULL);
+}
+EXPORT_SYMBOL(team_options_unregister);
+
+static int team_option_get(struct team *team, struct team_option *option,
+			   void *arg)
+{
+	return option->getter(team, arg);
+}
+
+static int team_option_set(struct team *team, struct team_option *option,
+			   void *arg)
+{
+	int err;
+
+	err = option->setter(team, arg);
+	if (err)
+		return err;
+
+	__team_options_change_check(team, option);
+	return err;
+}
+
+/****************
+ * Mode handling
+ ****************/
+
+static LIST_HEAD(mode_list);
+static DEFINE_SPINLOCK(mode_list_lock);
+
+static struct team_mode *__find_mode(const char *kind)
+{
+	struct team_mode *mode;
+
+	list_for_each_entry(mode, &mode_list, list) {
+		if (strcmp(mode->kind, kind) == 0)
+			return mode;
+	}
+	return NULL;
+}
+
+static bool is_good_mode_name(const char *name)
+{
+	while (*name != '\0') {
+		if (!isalpha(*name) && !isdigit(*name) && *name != '_')
+			return false;
+		name++;
+	}
+	return true;
+}
+
+int team_mode_register(struct team_mode *mode)
+{
+	int err = 0;
+
+	if (!is_good_mode_name(mode->kind) ||
+	    mode->priv_size > TEAM_MODE_PRIV_SIZE)
+		return -EINVAL;
+	spin_lock(&mode_list_lock);
+	if (__find_mode(mode->kind)) {
+		err = -EEXIST;
+		goto unlock;
+	}
+	list_add_tail(&mode->list, &mode_list);
+unlock:
+	spin_unlock(&mode_list_lock);
+	return err;
+}
+EXPORT_SYMBOL(team_mode_register);
+
+int team_mode_unregister(struct team_mode *mode)
+{
+	spin_lock(&mode_list_lock);
+	list_del_init(&mode->list);
+	spin_unlock(&mode_list_lock);
+	return 0;
+}
+EXPORT_SYMBOL(team_mode_unregister);
+
+static struct team_mode *team_mode_get(const char *kind)
+{
+	struct team_mode *mode;
+
+	spin_lock(&mode_list_lock);
+	mode = __find_mode(kind);
+	if (!mode) {
+		spin_unlock(&mode_list_lock);
+		request_module("team-mode-%s", kind);
+		spin_lock(&mode_list_lock);
+		mode = __find_mode(kind);
+	}
+	if (mode)
+		if (!try_module_get(mode->owner))
+			mode = NULL;
+
+	spin_unlock(&mode_list_lock);
+	return mode;
+}
+
+static void team_mode_put(const struct team_mode *mode)
+{
+	module_put(mode->owner);
+}
+
+static bool team_dummy_transmit(struct team *team, struct sk_buff *skb)
+{
+	dev_kfree_skb_any(skb);
+	return false;
+}
+
+rx_handler_result_t team_dummy_receive(struct team *team,
+				       struct team_port *port,
+				       struct sk_buff *skb)
+{
+	return RX_HANDLER_ANOTHER;
+}
+
+static void team_adjust_ops(struct team *team)
+{
+	/*
+	 * To avoid checks in rx/tx skb paths, ensure here that non-null and
+	 * correct ops are always set.
+	 */
+
+	if (list_empty(&team->port_list) ||
+	    !team->mode || !team->mode->ops->transmit)
+		team->ops.transmit = team_dummy_transmit;
+	else
+		team->ops.transmit = team->mode->ops->transmit;
+
+	if (list_empty(&team->port_list) ||
+	    !team->mode || !team->mode->ops->receive)
+		team->ops.receive = team_dummy_receive;
+	else
+		team->ops.receive = team->mode->ops->receive;
+}
+
+/*
+ * We can benefit from the fact that it's ensured no port is present
+ * at the time of mode change. Therefore no packets are in fly so there's no
+ * need to set mode operations in any special way.
+ */
+static int __team_change_mode(struct team *team,
+			      const struct team_mode *new_mode)
+{
+	/* Check if mode was previously set and do cleanup if so */
+	if (team->mode) {
+		void (*exit_op)(struct team *team) = team->ops.exit;
+
+		/* Clear ops area so no callback is called any longer */
+		memset(&team->ops, 0, sizeof(struct team_mode_ops));
+		team_adjust_ops(team);
+
+		if (exit_op)
+			exit_op(team);
+		team_mode_put(team->mode);
+		team->mode = NULL;
+		/* zero private data area */
+		memset(&team->mode_priv, 0,
+		       sizeof(struct team) - offsetof(struct team, mode_priv));
+	}
+
+	if (!new_mode)
+		return 0;
+
+	if (new_mode->ops->init) {
+		int err;
+
+		err = new_mode->ops->init(team);
+		if (err)
+			return err;
+	}
+
+	team->mode = new_mode;
+	memcpy(&team->ops, new_mode->ops, sizeof(struct team_mode_ops));
+	team_adjust_ops(team);
+
+	return 0;
+}
+
+static int team_change_mode(struct team *team, const char *kind)
+{
+	struct team_mode *new_mode;
+	struct net_device *dev = team->dev;
+	int err;
+
+	if (!list_empty(&team->port_list)) {
+		netdev_err(dev, "No ports can be present during mode change\n");
+		return -EBUSY;
+	}
+
+	if (team->mode && strcmp(team->mode->kind, kind) == 0) {
+		netdev_err(dev, "Unable to change to the same mode the team is in\n");
+		return -EINVAL;
+	}
+
+	new_mode = team_mode_get(kind);
+	if (!new_mode) {
+		netdev_err(dev, "Mode \"%s\" not found\n", kind);
+		return -EINVAL;
+	}
+
+	err = __team_change_mode(team, new_mode);
+	if (err) {
+		netdev_err(dev, "Failed to change to mode \"%s\"\n", kind);
+		team_mode_put(new_mode);
+		return err;
+	}
+
+	netdev_info(dev, "Mode changed to \"%s\"\n", kind);
+	return 0;
+}
+
+
+/************************
+ * Rx path frame handler
+ ************************/
+
+/* note: already called with rcu_read_lock */
+static rx_handler_result_t team_handle_frame(struct sk_buff **pskb)
+{
+	struct sk_buff *skb = *pskb;
+	struct team_port *port;
+	struct team *team;
+	rx_handler_result_t res;
+
+	skb = skb_share_check(skb, GFP_ATOMIC);
+	if (!skb)
+		return RX_HANDLER_CONSUMED;
+
+	*pskb = skb;
+
+	port = team_port_get_rcu(skb->dev);
+	team = port->team;
+
+	res = team->ops.receive(team, port, skb);
+	if (res == RX_HANDLER_ANOTHER) {
+		struct team_pcpu_stats *pcpu_stats;
+
+		pcpu_stats = this_cpu_ptr(team->pcpu_stats);
+		u64_stats_update_begin(&pcpu_stats->syncp);
+		pcpu_stats->rx_packets++;
+		pcpu_stats->rx_bytes += skb->len;
+		if (skb->pkt_type == PACKET_MULTICAST)
+			pcpu_stats->rx_multicast++;
+		u64_stats_update_end(&pcpu_stats->syncp);
+
+		skb->dev = team->dev;
+	} else {
+		this_cpu_inc(team->pcpu_stats->rx_dropped);
+	}
+
+	return res;
+}
+
+
+/****************
+ * Port handling
+ ****************/
+
+static bool team_port_find(const struct team *team,
+			   const struct team_port *port)
+{
+	struct team_port *cur;
+
+	list_for_each_entry(cur, &team->port_list, list)
+		if (cur == port)
+			return true;
+	return false;
+}
+
+/*
+ * Add/delete port to the team port list. Write guarded by rtnl_lock.
+ * Takes care of correct port->index setup (might be racy).
+ */
+static void team_port_list_add_port(struct team *team,
+				    struct team_port *port)
+{
+	port->index = team->port_count++;
+	hlist_add_head_rcu(&port->hlist,
+			   team_port_index_hash(team, port->index));
+	list_add_tail_rcu(&port->list, &team->port_list);
+}
+
+static void __reconstruct_port_hlist(struct team *team, int rm_index)
+{
+	int i;
+	struct team_port *port;
+
+	for (i = rm_index + 1; i < team->port_count; i++) {
+		port = team_get_port_by_index(team, i);
+		hlist_del_rcu(&port->hlist);
+		port->index--;
+		hlist_add_head_rcu(&port->hlist,
+				   team_port_index_hash(team, port->index));
+	}
+}
+
+static void team_port_list_del_port(struct team *team,
+				   struct team_port *port)
+{
+	int rm_index = port->index;
+
+	hlist_del_rcu(&port->hlist);
+	list_del_rcu(&port->list);
+	__reconstruct_port_hlist(team, rm_index);
+	team->port_count--;
+}
+
+#define TEAM_VLAN_FEATURES (NETIF_F_ALL_CSUM | NETIF_F_SG | \
+			    NETIF_F_FRAGLIST | NETIF_F_ALL_TSO | \
+			    NETIF_F_HIGHDMA | NETIF_F_LRO)
+
+static void __team_compute_features(struct team *team)
+{
+	struct team_port *port;
+	u32 vlan_features = TEAM_VLAN_FEATURES;
+	unsigned short max_hard_header_len = ETH_HLEN;
+
+	list_for_each_entry(port, &team->port_list, list) {
+		vlan_features = netdev_increment_features(vlan_features,
+					port->dev->vlan_features,
+					TEAM_VLAN_FEATURES);
+
+		if (port->dev->hard_header_len > max_hard_header_len)
+			max_hard_header_len = port->dev->hard_header_len;
+	}
+
+	team->dev->vlan_features = vlan_features;
+	team->dev->hard_header_len = max_hard_header_len;
+
+	netdev_change_features(team->dev);
+}
+
+static void team_compute_features(struct team *team)
+{
+	spin_lock(&team->lock);
+	__team_compute_features(team);
+	spin_unlock(&team->lock);
+}
+
+static int team_port_enter(struct team *team, struct team_port *port)
+{
+	int err = 0;
+
+	dev_hold(team->dev);
+	port->dev->priv_flags |= IFF_TEAM_PORT;
+	if (team->ops.port_enter) {
+		err = team->ops.port_enter(team, port);
+		if (err) {
+			netdev_err(team->dev, "Device %s failed to enter team mode\n",
+				   port->dev->name);
+			goto err_port_enter;
+		}
+	}
+
+	return 0;
+
+err_port_enter:
+	port->dev->priv_flags &= ~IFF_TEAM_PORT;
+	dev_put(team->dev);
+
+	return err;
+}
+
+static void team_port_leave(struct team *team, struct team_port *port)
+{
+	if (team->ops.port_leave)
+		team->ops.port_leave(team, port);
+	port->dev->priv_flags &= ~IFF_TEAM_PORT;
+	dev_put(team->dev);
+}
+
+static void __team_port_change_check(struct team_port *port, bool linkup);
+
+static int team_port_add(struct team *team, struct net_device *port_dev)
+{
+	struct net_device *dev = team->dev;
+	struct team_port *port;
+	char *portname = port_dev->name;
+	int err;
+
+	if (port_dev->flags & IFF_LOOPBACK ||
+	    port_dev->type != ARPHRD_ETHER) {
+		netdev_err(dev, "Device %s is of an unsupported type\n",
+			   portname);
+		return -EINVAL;
+	}
+
+	if (team_port_exists(port_dev)) {
+		netdev_err(dev, "Device %s is already a port "
+				"of a team device\n", portname);
+		return -EBUSY;
+	}
+
+	if (port_dev->flags & IFF_UP) {
+		netdev_err(dev, "Device %s is up. Set it down before adding it as a team port\n",
+			   portname);
+		return -EBUSY;
+	}
+
+	port = kzalloc(sizeof(struct team_port), GFP_KERNEL);
+	if (!port)
+		return -ENOMEM;
+
+	port->dev = port_dev;
+	port->team = team;
+
+	port->orig.mtu = port_dev->mtu;
+	err = dev_set_mtu(port_dev, dev->mtu);
+	if (err) {
+		netdev_dbg(dev, "Error %d calling dev_set_mtu\n", err);
+		goto err_set_mtu;
+	}
+
+	memcpy(port->orig.dev_addr, port_dev->dev_addr, ETH_ALEN);
+
+	err = team_port_enter(team, port);
+	if (err) {
+		netdev_err(dev, "Device %s failed to enter team mode\n",
+			   portname);
+		goto err_port_enter;
+	}
+
+	err = dev_open(port_dev);
+	if (err) {
+		netdev_dbg(dev, "Device %s opening failed\n",
+			   portname);
+		goto err_dev_open;
+	}
+
+	err = netdev_set_master(port_dev, dev);
+	if (err) {
+		netdev_err(dev, "Device %s failed to set master\n", portname);
+		goto err_set_master;
+	}
+
+	err = netdev_rx_handler_register(port_dev, team_handle_frame,
+					 port);
+	if (err) {
+		netdev_err(dev, "Device %s failed to register rx_handler\n",
+			   portname);
+		goto err_handler_register;
+	}
+
+	team_port_list_add_port(team, port);
+	team_adjust_ops(team);
+	__team_compute_features(team);
+	__team_port_change_check(port, !!netif_carrier_ok(port_dev));
+
+	netdev_info(dev, "Port device %s added\n", portname);
+
+	return 0;
+
+err_handler_register:
+	netdev_set_master(port_dev, NULL);
+
+err_set_master:
+	dev_close(port_dev);
+
+err_dev_open:
+	team_port_leave(team, port);
+	team_port_set_orig_mac(port);
+
+err_port_enter:
+	dev_set_mtu(port_dev, port->orig.mtu);
+
+err_set_mtu:
+	kfree(port);
+
+	return err;
+}
+
+static int team_port_del(struct team *team, struct net_device *port_dev)
+{
+	struct net_device *dev = team->dev;
+	struct team_port *port;
+	char *portname = port_dev->name;
+
+	port = team_port_get_rtnl(port_dev);
+	if (!port || !team_port_find(team, port)) {
+		netdev_err(dev, "Device %s does not act as a port of this team\n",
+			   portname);
+		return -ENOENT;
+	}
+
+	__team_port_change_check(port, false);
+	team_port_list_del_port(team, port);
+	team_adjust_ops(team);
+	netdev_rx_handler_unregister(port_dev);
+	netdev_set_master(port_dev, NULL);
+	dev_close(port_dev);
+	team_port_leave(team, port);
+	team_port_set_orig_mac(port);
+	dev_set_mtu(port_dev, port->orig.mtu);
+	synchronize_rcu();
+	kfree(port);
+	netdev_info(dev, "Port device %s removed\n", portname);
+	__team_compute_features(team);
+
+	return 0;
+}
+
+
+/*****************
+ * Net device ops
+ *****************/
+
+static const char team_no_mode_kind[] = "*NOMODE*";
+
+static int team_mode_option_get(struct team *team, void *arg)
+{
+	const char **str = arg;
+
+	*str = team->mode ? team->mode->kind : team_no_mode_kind;
+	return 0;
+}
+
+static int team_mode_option_set(struct team *team, void *arg)
+{
+	const char **str = arg;
+
+	return team_change_mode(team, *str);
+}
+
+static struct team_option team_options[] = {
+	{
+		.name = "mode",
+		.type = TEAM_OPTION_TYPE_STRING,
+		.getter = team_mode_option_get,
+		.setter = team_mode_option_set,
+	},
+};
+
+static int team_init(struct net_device *dev)
+{
+	struct team *team = netdev_priv(dev);
+	int i;
+
+	team->dev = dev;
+	spin_lock_init(&team->lock);
+
+	team->pcpu_stats = alloc_percpu(struct team_pcpu_stats);
+	if (!team->pcpu_stats)
+		return -ENOMEM;
+
+	for (i = 0; i < TEAM_PORT_HASHENTRIES; i++)
+		INIT_HLIST_HEAD(&team->port_hlist[i]);
+	INIT_LIST_HEAD(&team->port_list);
+
+	team_adjust_ops(team);
+
+	INIT_LIST_HEAD(&team->option_list);
+	team_options_register(team, team_options, ARRAY_SIZE(team_options));
+	netif_carrier_off(dev);
+
+	return 0;
+}
+
+static void team_uninit(struct net_device *dev)
+{
+	struct team *team = netdev_priv(dev);
+	struct team_port *port;
+	struct team_port *tmp;
+
+	spin_lock(&team->lock);
+	list_for_each_entry_safe(port, tmp, &team->port_list, list)
+		team_port_del(team, port->dev);
+
+	__team_change_mode(team, NULL); /* cleanup */
+	__team_options_unregister(team, team_options, ARRAY_SIZE(team_options));
+	spin_unlock(&team->lock);
+}
+
+static void team_destructor(struct net_device *dev)
+{
+	struct team *team = netdev_priv(dev);
+
+	free_percpu(team->pcpu_stats);
+	free_netdev(dev);
+}
+
+static int team_open(struct net_device *dev)
+{
+	netif_carrier_on(dev);
+	return 0;
+}
+
+static int team_close(struct net_device *dev)
+{
+	netif_carrier_off(dev);
+	return 0;
+}
+
+/*
+ * note: already called with rcu_read_lock
+ */
+static netdev_tx_t team_xmit(struct sk_buff *skb, struct net_device *dev)
+{
+	struct team *team = netdev_priv(dev);
+	bool tx_success = false;
+	unsigned int len = skb->len;
+
+	tx_success = team->ops.transmit(team, skb);
+	if (tx_success) {
+		struct team_pcpu_stats *pcpu_stats;
+
+		pcpu_stats = this_cpu_ptr(team->pcpu_stats);
+		u64_stats_update_begin(&pcpu_stats->syncp);
+		pcpu_stats->tx_packets++;
+		pcpu_stats->tx_bytes += len;
+		u64_stats_update_end(&pcpu_stats->syncp);
+	} else {
+		this_cpu_inc(team->pcpu_stats->tx_dropped);
+	}
+
+	return NETDEV_TX_OK;
+}
+
+static void team_change_rx_flags(struct net_device *dev, int change)
+{
+	struct team *team = netdev_priv(dev);
+	struct team_port *port;
+	int inc;
+
+	rcu_read_lock();
+	list_for_each_entry_rcu(port, &team->port_list, list) {
+		if (change & IFF_PROMISC) {
+			inc = dev->flags & IFF_PROMISC ? 1 : -1;
+			dev_set_promiscuity(port->dev, inc);
+		}
+		if (change & IFF_ALLMULTI) {
+			inc = dev->flags & IFF_ALLMULTI ? 1 : -1;
+			dev_set_allmulti(port->dev, inc);
+		}
+	}
+	rcu_read_unlock();
+}
+
+static void team_set_rx_mode(struct net_device *dev)
+{
+	struct team *team = netdev_priv(dev);
+	struct team_port *port;
+
+	rcu_read_lock();
+	list_for_each_entry_rcu(port, &team->port_list, list) {
+		dev_uc_sync(port->dev, dev);
+		dev_mc_sync(port->dev, dev);
+	}
+	rcu_read_unlock();
+}
+
+static int team_set_mac_address(struct net_device *dev, void *p)
+{
+	struct team *team = netdev_priv(dev);
+	struct team_port *port;
+	struct sockaddr *addr = p;
+
+	memcpy(dev->dev_addr, addr->sa_data, ETH_ALEN);
+	rcu_read_lock();
+	list_for_each_entry_rcu(port, &team->port_list, list)
+		if (team->ops.port_change_mac)
+			team->ops.port_change_mac(team, port);
+	rcu_read_unlock();
+	return 0;
+}
+
+static int team_change_mtu(struct net_device *dev, int new_mtu)
+{
+	struct team *team = netdev_priv(dev);
+	struct team_port *port;
+	int err;
+
+	/*
+	 * Alhough this is reader, it's guarded by team lock. It's not possible
+	 * to traverse list in reverse under rcu_read_lock
+	 */
+	spin_lock(&team->lock);
+	list_for_each_entry(port, &team->port_list, list) {
+		err = dev_set_mtu(port->dev, new_mtu);
+		if (err) {
+			netdev_err(dev, "Device %s failed to change mtu",
+				   port->dev->name);
+			goto unwind;
+		}
+	}
+	spin_unlock(&team->lock);
+
+	dev->mtu = new_mtu;
+
+	return 0;
+
+unwind:
+	list_for_each_entry_continue_reverse(port, &team->port_list, list)
+		dev_set_mtu(port->dev, dev->mtu);
+	spin_unlock(&team->lock);
+
+	return err;
+}
+
+static struct rtnl_link_stats64 *
+team_get_stats64(struct net_device *dev, struct rtnl_link_stats64 *stats)
+{
+	struct team *team = netdev_priv(dev);
+	struct team_pcpu_stats *p;
+	u64 rx_packets, rx_bytes, rx_multicast, tx_packets, tx_bytes;
+	u32 rx_dropped = 0, tx_dropped = 0;
+	unsigned int start;
+	int i;
+
+	for_each_possible_cpu(i) {
+		p = per_cpu_ptr(team->pcpu_stats, i);
+		do {
+			start = u64_stats_fetch_begin_bh(&p->syncp);
+			rx_packets	= p->rx_packets;
+			rx_bytes	= p->rx_bytes;
+			rx_multicast	= p->rx_multicast;
+			tx_packets	= p->tx_packets;
+			tx_bytes	= p->tx_bytes;
+		} while (u64_stats_fetch_retry_bh(&p->syncp, start));
+
+		stats->rx_packets	+= rx_packets;
+		stats->rx_bytes		+= rx_bytes;
+		stats->multicast	+= rx_multicast;
+		stats->tx_packets	+= tx_packets;
+		stats->tx_bytes		+= tx_bytes;
+		/*
+		 * rx_dropped & tx_dropped are u32, updated
+		 * without syncp protection.
+		 */
+		rx_dropped	+= p->rx_dropped;
+		tx_dropped	+= p->tx_dropped;
+	}
+	stats->rx_dropped	= rx_dropped;
+	stats->tx_dropped	= tx_dropped;
+	return stats;
+}
+
+static void team_vlan_rx_add_vid(struct net_device *dev, uint16_t vid)
+{
+	struct team *team = netdev_priv(dev);
+	struct team_port *port;
+
+	rcu_read_lock();
+	list_for_each_entry_rcu(port, &team->port_list, list) {
+		const struct net_device_ops *ops = port->dev->netdev_ops;
+
+		if (ops->ndo_vlan_rx_add_vid)
+			ops->ndo_vlan_rx_add_vid(port->dev, vid);
+	}
+	rcu_read_unlock();
+}
+
+static void team_vlan_rx_kill_vid(struct net_device *dev, uint16_t vid)
+{
+	struct team *team = netdev_priv(dev);
+	struct team_port *port;
+
+	rcu_read_lock();
+	list_for_each_entry_rcu(port, &team->port_list, list) {
+		const struct net_device_ops *ops = port->dev->netdev_ops;
+
+		if (ops->ndo_vlan_rx_kill_vid)
+			ops->ndo_vlan_rx_kill_vid(port->dev, vid);
+	}
+	rcu_read_unlock();
+}
+
+static int team_add_slave(struct net_device *dev, struct net_device *port_dev)
+{
+	struct team *team = netdev_priv(dev);
+	int err;
+
+	spin_lock(&team->lock);
+	err = team_port_add(team, port_dev);
+	spin_unlock(&team->lock);
+	return err;
+}
+
+static int team_del_slave(struct net_device *dev, struct net_device *port_dev)
+{
+	struct team *team = netdev_priv(dev);
+	int err;
+
+	spin_lock(&team->lock);
+	err = team_port_del(team, port_dev);
+	spin_unlock(&team->lock);
+	return err;
+}
+
+static const struct net_device_ops team_netdev_ops = {
+	.ndo_init		= team_init,
+	.ndo_uninit		= team_uninit,
+	.ndo_open		= team_open,
+	.ndo_stop		= team_close,
+	.ndo_start_xmit		= team_xmit,
+	.ndo_change_rx_flags	= team_change_rx_flags,
+	.ndo_set_rx_mode	= team_set_rx_mode,
+	.ndo_set_mac_address	= team_set_mac_address,
+	.ndo_change_mtu		= team_change_mtu,
+	.ndo_get_stats64	= team_get_stats64,
+	.ndo_vlan_rx_add_vid	= team_vlan_rx_add_vid,
+	.ndo_vlan_rx_kill_vid	= team_vlan_rx_kill_vid,
+	.ndo_add_slave		= team_add_slave,
+	.ndo_del_slave		= team_del_slave,
+};
+
+
+/***********************
+ * rt netlink interface
+ ***********************/
+
+static void team_setup(struct net_device *dev)
+{
+	ether_setup(dev);
+
+	dev->netdev_ops = &team_netdev_ops;
+	dev->destructor	= team_destructor;
+	dev->tx_queue_len = 0;
+	dev->flags |= IFF_MULTICAST;
+	dev->priv_flags &= ~(IFF_XMIT_DST_RELEASE | IFF_TX_SKB_SHARING);
+
+	/*
+	 * Indicate we support unicast address filtering. That way core won't
+	 * bring us to promisc mode in case a unicast addr is added.
+	 * Let this up to underlay drivers.
+	 */
+	dev->priv_flags |= IFF_UNICAST_FLT;
+
+	dev->features |= NETIF_F_LLTX;
+	dev->features |= NETIF_F_GRO;
+	dev->hw_features = NETIF_F_HW_VLAN_TX |
+			   NETIF_F_HW_VLAN_RX |
+			   NETIF_F_HW_VLAN_FILTER;
+
+	dev->features |= dev->hw_features;
+}
+
+static int team_newlink(struct net *src_net, struct net_device *dev,
+			struct nlattr *tb[], struct nlattr *data[])
+{
+	int err;
+
+	if (tb[IFLA_ADDRESS] == NULL)
+		random_ether_addr(dev->dev_addr);
+
+	err = register_netdevice(dev);
+	if (err)
+		return err;
+
+	return 0;
+}
+
+static int team_validate(struct nlattr *tb[], struct nlattr *data[])
+{
+	if (tb[IFLA_ADDRESS]) {
+		if (nla_len(tb[IFLA_ADDRESS]) != ETH_ALEN)
+			return -EINVAL;
+		if (!is_valid_ether_addr(nla_data(tb[IFLA_ADDRESS])))
+			return -EADDRNOTAVAIL;
+	}
+	return 0;
+}
+
+static struct rtnl_link_ops team_link_ops __read_mostly = {
+	.kind		= DRV_NAME,
+	.priv_size	= sizeof(struct team),
+	.setup		= team_setup,
+	.newlink	= team_newlink,
+	.validate	= team_validate,
+};
+
+
+/***********************************
+ * Generic netlink custom interface
+ ***********************************/
+
+static struct genl_family team_nl_family = {
+	.id		= GENL_ID_GENERATE,
+	.name		= TEAM_GENL_NAME,
+	.version	= TEAM_GENL_VERSION,
+	.maxattr	= TEAM_ATTR_MAX,
+	.netnsok	= true,
+};
+
+static const struct nla_policy team_nl_policy[TEAM_ATTR_MAX + 1] = {
+	[TEAM_ATTR_UNSPEC]			= { .type = NLA_UNSPEC, },
+	[TEAM_ATTR_TEAM_IFINDEX]		= { .type = NLA_U32 },
+	[TEAM_ATTR_LIST_OPTION]			= { .type = NLA_NESTED },
+	[TEAM_ATTR_LIST_PORT]			= { .type = NLA_NESTED },
+};
+
+static const struct nla_policy
+team_nl_option_policy[TEAM_ATTR_OPTION_MAX + 1] = {
+	[TEAM_ATTR_OPTION_UNSPEC]		= { .type = NLA_UNSPEC, },
+	[TEAM_ATTR_OPTION_NAME] = {
+		.type = NLA_STRING,
+		.len = TEAM_STRING_MAX_LEN,
+	},
+	[TEAM_ATTR_OPTION_CHANGED]		= { .type = NLA_FLAG },
+	[TEAM_ATTR_OPTION_TYPE]			= { .type = NLA_U8 },
+	[TEAM_ATTR_OPTION_DATA] = {
+		.type = NLA_BINARY,
+		.len = TEAM_STRING_MAX_LEN,
+	},
+};
+
+static int team_nl_cmd_noop(struct sk_buff *skb, struct genl_info *info)
+{
+	struct sk_buff *msg;
+	void *hdr;
+	int err;
+
+	msg = nlmsg_new(NLMSG_GOODSIZE, GFP_KERNEL);
+	if (!msg)
+		return -ENOMEM;
+
+	hdr = genlmsg_put(msg, info->snd_pid, info->snd_seq,
+			  &team_nl_family, 0, TEAM_CMD_NOOP);
+	if (IS_ERR(hdr)) {
+		err = PTR_ERR(hdr);
+		goto err_msg_put;
+	}
+
+	genlmsg_end(msg, hdr);
+
+	return genlmsg_unicast(genl_info_net(info), msg, info->snd_pid);
+
+err_msg_put:
+	nlmsg_free(msg);
+
+	return err;
+}
+
+/*
+ * Netlink cmd functions should be locked by following two functions.
+ * To ensure team_uninit would not be called in between, hold rcu_read_lock
+ * all the time.
+ */
+static struct team *team_nl_team_get(struct genl_info *info)
+{
+	struct net *net = genl_info_net(info);
+	int ifindex;
+	struct net_device *dev;
+	struct team *team;
+
+	if (!info->attrs[TEAM_ATTR_TEAM_IFINDEX])
+		return NULL;
+
+	ifindex = nla_get_u32(info->attrs[TEAM_ATTR_TEAM_IFINDEX]);
+	rcu_read_lock();
+	dev = dev_get_by_index_rcu(net, ifindex);
+	if (!dev || dev->netdev_ops != &team_netdev_ops) {
+		rcu_read_unlock();
+		return NULL;
+	}
+
+	team = netdev_priv(dev);
+	spin_lock(&team->lock);
+	return team;
+}
+
+static void team_nl_team_put(struct team *team)
+{
+	spin_unlock(&team->lock);
+	rcu_read_unlock();
+}
+
+static int team_nl_send_generic(struct genl_info *info, struct team *team,
+				int (*fill_func)(struct sk_buff *skb,
+						 struct genl_info *info,
+						 int flags, struct team *team))
+{
+	struct sk_buff *skb;
+	int err;
+
+	skb = nlmsg_new(NLMSG_GOODSIZE, GFP_KERNEL);
+	if (!skb)
+		return -ENOMEM;
+
+	err = fill_func(skb, info, NLM_F_ACK, team);
+	if (err < 0)
+		goto err_fill;
+
+	err = genlmsg_unicast(genl_info_net(info), skb, info->snd_pid);
+	return err;
+
+err_fill:
+	nlmsg_free(skb);
+	return err;
+}
+
+static int team_nl_fill_options_get_changed(struct sk_buff *skb,
+					    u32 pid, u32 seq, int flags,
+					    struct team *team,
+					    struct team_option *changed_option)
+{
+	struct nlattr *option_list;
+	void *hdr;
+	struct team_option *option;
+
+	hdr = genlmsg_put(skb, pid, seq, &team_nl_family, flags,
+			  TEAM_CMD_OPTIONS_GET);
+	if (IS_ERR(hdr))
+		return PTR_ERR(hdr);
+
+	NLA_PUT_U32(skb, TEAM_ATTR_TEAM_IFINDEX, team->dev->ifindex);
+	option_list = nla_nest_start(skb, TEAM_ATTR_LIST_OPTION);
+	if (!option_list)
+		return -EMSGSIZE;
+
+	list_for_each_entry(option, &team->option_list, list) {
+		struct nlattr *option_item;
+		long arg;
+
+		option_item = nla_nest_start(skb, TEAM_ATTR_ITEM_OPTION);
+		if (!option_item)
+			goto nla_put_failure;
+		NLA_PUT_STRING(skb, TEAM_ATTR_OPTION_NAME, option->name);
+		if (option == changed_option)
+			NLA_PUT_FLAG(skb, TEAM_ATTR_OPTION_CHANGED);
+		switch (option->type) {
+		case TEAM_OPTION_TYPE_U32:
+			NLA_PUT_U8(skb, TEAM_ATTR_OPTION_TYPE, NLA_U32);
+			team_option_get(team, option, &arg);
+			NLA_PUT_U32(skb, TEAM_ATTR_OPTION_DATA, arg);
+			break;
+		case TEAM_OPTION_TYPE_STRING:
+			NLA_PUT_U8(skb, TEAM_ATTR_OPTION_TYPE, NLA_STRING);
+			team_option_get(team, option, &arg);
+			NLA_PUT_STRING(skb, TEAM_ATTR_OPTION_DATA,
+				       (char *) arg);
+			break;
+		default:
+			BUG();
+		}
+		nla_nest_end(skb, option_item);
+	}
+
+	nla_nest_end(skb, option_list);
+	return genlmsg_end(skb, hdr);
+
+nla_put_failure:
+	genlmsg_cancel(skb, hdr);
+	return -EMSGSIZE;
+}
+
+static int team_nl_fill_options_get(struct sk_buff *skb,
+				    struct genl_info *info, int flags,
+				    struct team *team)
+{
+	return team_nl_fill_options_get_changed(skb, info->snd_pid,
+						info->snd_seq, NLM_F_ACK,
+						team, NULL);
+}
+
+static int team_nl_cmd_options_get(struct sk_buff *skb, struct genl_info *info)
+{
+	struct team *team;
+	int err;
+
+	team = team_nl_team_get(info);
+	if (!team)
+		return -EINVAL;
+
+	err = team_nl_send_generic(info, team, team_nl_fill_options_get);
+
+	team_nl_team_put(team);
+
+	return err;
+}
+
+static int team_nl_cmd_options_set(struct sk_buff *skb, struct genl_info *info)
+{
+	struct team *team;
+	int err = 0;
+	int i;
+	struct nlattr *nl_option;
+
+	team = team_nl_team_get(info);
+	if (!team)
+		return -EINVAL;
+
+	err = -EINVAL;
+	if (!info->attrs[TEAM_ATTR_LIST_OPTION]) {
+		err = -EINVAL;
+		goto team_put;
+	}
+
+	nla_for_each_nested(nl_option, info->attrs[TEAM_ATTR_LIST_OPTION], i) {
+		struct nlattr *mode_attrs[TEAM_ATTR_OPTION_MAX + 1];
+		enum team_option_type opt_type;
+		struct team_option *option;
+		char *opt_name;
+		bool opt_found = false;
+
+		if (nla_type(nl_option) != TEAM_ATTR_ITEM_OPTION) {
+			err = -EINVAL;
+			goto team_put;
+		}
+		err = nla_parse_nested(mode_attrs, TEAM_ATTR_OPTION_MAX,
+				       nl_option, team_nl_option_policy);
+		if (err)
+			goto team_put;
+		if (!mode_attrs[TEAM_ATTR_OPTION_NAME] ||
+		    !mode_attrs[TEAM_ATTR_OPTION_TYPE] ||
+		    !mode_attrs[TEAM_ATTR_OPTION_DATA]) {
+			err = -EINVAL;
+			goto team_put;
+		}
+		switch (nla_get_u8(mode_attrs[TEAM_ATTR_OPTION_TYPE])) {
+		case NLA_U32:
+			opt_type = TEAM_OPTION_TYPE_U32;
+			break;
+		case NLA_STRING:
+			opt_type = TEAM_OPTION_TYPE_STRING;
+			break;
+		default:
+			goto team_put;
+		}
+
+		opt_name = nla_data(mode_attrs[TEAM_ATTR_OPTION_NAME]);
+		list_for_each_entry(option, &team->option_list, list) {
+			long arg;
+			struct nlattr *opt_data_attr;
+
+			if (option->type != opt_type ||
+			    strcmp(option->name, opt_name))
+				continue;
+			opt_found = true;
+			opt_data_attr = mode_attrs[TEAM_ATTR_OPTION_DATA];
+			switch (opt_type) {
+			case TEAM_OPTION_TYPE_U32:
+				arg = nla_get_u32(opt_data_attr);
+				break;
+			case TEAM_OPTION_TYPE_STRING:
+				arg = (long) nla_data(opt_data_attr);
+				break;
+			default:
+				BUG();
+			}
+			err = team_option_set(team, option, &arg);
+			if (err)
+				goto team_put;
+		}
+		if (!opt_found) {
+			err = -ENOENT;
+			goto team_put;
+		}
+	}
+
+team_put:
+	team_nl_team_put(team);
+
+	return err;
+}
+
+static int team_nl_fill_port_list_get_changed(struct sk_buff *skb,
+					      u32 pid, u32 seq, int flags,
+					      struct team *team,
+					      struct team_port *changed_port)
+{
+	struct nlattr *port_list;
+	void *hdr;
+	struct team_port *port;
+
+	hdr = genlmsg_put(skb, pid, seq, &team_nl_family, flags,
+			  TEAM_CMD_PORT_LIST_GET);
+	if (IS_ERR(hdr))
+		return PTR_ERR(hdr);
+
+	NLA_PUT_U32(skb, TEAM_ATTR_TEAM_IFINDEX, team->dev->ifindex);
+	port_list = nla_nest_start(skb, TEAM_ATTR_LIST_PORT);
+	if (!port_list)
+		return -EMSGSIZE;
+
+	list_for_each_entry(port, &team->port_list, list) {
+		struct nlattr *port_item;
+
+		port_item = nla_nest_start(skb, TEAM_ATTR_ITEM_PORT);
+		if (!port_item)
+			goto nla_put_failure;
+		NLA_PUT_U32(skb, TEAM_ATTR_PORT_IFINDEX, port->dev->ifindex);
+		if (port == changed_port)
+			NLA_PUT_FLAG(skb, TEAM_ATTR_PORT_CHANGED);
+		if (port->linkup)
+			NLA_PUT_FLAG(skb, TEAM_ATTR_PORT_LINKUP);
+		NLA_PUT_U32(skb, TEAM_ATTR_PORT_SPEED, port->speed);
+		NLA_PUT_U8(skb, TEAM_ATTR_PORT_DUPLEX, port->duplex);
+		nla_nest_end(skb, port_item);
+	}
+
+	nla_nest_end(skb, port_list);
+	return genlmsg_end(skb, hdr);
+
+nla_put_failure:
+	genlmsg_cancel(skb, hdr);
+	return -EMSGSIZE;
+}
+
+static int team_nl_fill_port_list_get(struct sk_buff *skb,
+				      struct genl_info *info, int flags,
+				      struct team *team)
+{
+	return team_nl_fill_port_list_get_changed(skb, info->snd_pid,
+						  info->snd_seq, NLM_F_ACK,
+						  team, NULL);
+}
+
+static int team_nl_cmd_port_list_get(struct sk_buff *skb,
+				     struct genl_info *info)
+{
+	struct team *team;
+	int err;
+
+	team = team_nl_team_get(info);
+	if (!team)
+		return -EINVAL;
+
+	err = team_nl_send_generic(info, team, team_nl_fill_port_list_get);
+
+	team_nl_team_put(team);
+
+	return err;
+}
+
+static struct genl_ops team_nl_ops[] = {
+	{
+		.cmd = TEAM_CMD_NOOP,
+		.doit = team_nl_cmd_noop,
+		.policy = team_nl_policy,
+	},
+	{
+		.cmd = TEAM_CMD_OPTIONS_SET,
+		.doit = team_nl_cmd_options_set,
+		.policy = team_nl_policy,
+		.flags = GENL_ADMIN_PERM,
+	},
+	{
+		.cmd = TEAM_CMD_OPTIONS_GET,
+		.doit = team_nl_cmd_options_get,
+		.policy = team_nl_policy,
+		.flags = GENL_ADMIN_PERM,
+	},
+	{
+		.cmd = TEAM_CMD_PORT_LIST_GET,
+		.doit = team_nl_cmd_port_list_get,
+		.policy = team_nl_policy,
+		.flags = GENL_ADMIN_PERM,
+	},
+};
+
+static struct genl_multicast_group team_change_event_mcgrp = {
+	.name = TEAM_GENL_CHANGE_EVENT_MC_GRP_NAME,
+};
+
+static int team_nl_send_event_options_get(struct team *team,
+					  struct team_option *changed_option)
+{
+	struct sk_buff *skb;
+	int err;
+	struct net *net = dev_net(team->dev);
+
+	skb = nlmsg_new(NLMSG_GOODSIZE, GFP_KERNEL);
+	if (!skb)
+		return -ENOMEM;
+
+	err = team_nl_fill_options_get_changed(skb, 0, 0, 0, team,
+					       changed_option);
+	if (err < 0)
+		goto err_fill;
+
+	err = genlmsg_multicast_netns(net, skb, 0, team_change_event_mcgrp.id,
+				      GFP_KERNEL);
+	return err;
+
+err_fill:
+	nlmsg_free(skb);
+	return err;
+}
+
+static int team_nl_send_event_port_list_get(struct team_port *port)
+{
+	struct sk_buff *skb;
+	int err;
+	struct net *net = dev_net(port->team->dev);
+
+	skb = nlmsg_new(NLMSG_GOODSIZE, GFP_KERNEL);
+	if (!skb)
+		return -ENOMEM;
+
+	err = team_nl_fill_port_list_get_changed(skb, 0, 0, 0,
+						 port->team, port);
+	if (err < 0)
+		goto err_fill;
+
+	err = genlmsg_multicast_netns(net, skb, 0, team_change_event_mcgrp.id,
+				      GFP_KERNEL);
+	return err;
+
+err_fill:
+	nlmsg_free(skb);
+	return err;
+}
+
+static int team_nl_init(void)
+{
+	int err;
+
+	err = genl_register_family_with_ops(&team_nl_family, team_nl_ops,
+					    ARRAY_SIZE(team_nl_ops));
+	if (err)
+		return err;
+
+	err = genl_register_mc_group(&team_nl_family, &team_change_event_mcgrp);
+	if (err)
+		goto err_change_event_grp_reg;
+
+	return 0;
+
+err_change_event_grp_reg:
+	genl_unregister_family(&team_nl_family);
+
+	return err;
+}
+
+static void team_nl_fini(void)
+{
+	genl_unregister_family(&team_nl_family);
+}
+
+
+/******************
+ * Change checkers
+ ******************/
+
+static void __team_options_change_check(struct team *team,
+					struct team_option *changed_option)
+{
+	int err;
+
+	err = team_nl_send_event_options_get(team, changed_option);
+	if (err)
+		netdev_warn(team->dev, "Failed to send options change via netlink\n");
+}
+
+/* rtnl lock is held */
+static void __team_port_change_check(struct team_port *port, bool linkup)
+{
+	int err;
+
+	if (port->linkup == linkup)
+		return;
+
+	port->linkup = linkup;
+	if (linkup) {
+		struct ethtool_cmd ecmd;
+
+		err = __ethtool_get_settings(port->dev, &ecmd);
+		if (!err) {
+			port->speed = ethtool_cmd_speed(&ecmd);
+			port->duplex = ecmd.duplex;
+			goto send_event;
+		}
+	}
+	port->speed = 0;
+	port->duplex = 0;
+
+send_event:
+	err = team_nl_send_event_port_list_get(port);
+	if (err)
+		netdev_warn(port->team->dev, "Failed to send port change of device %s via netlink\n",
+			    port->dev->name);
+
+}
+
+static void team_port_change_check(struct team_port *port, bool linkup)
+{
+	struct team *team = port->team;
+
+	spin_lock(&team->lock);
+	__team_port_change_check(port, linkup);
+	spin_unlock(&team->lock);
+}
+
+/************************************
+ * Net device notifier event handler
+ ************************************/
+
+static int team_device_event(struct notifier_block *unused,
+			     unsigned long event, void *ptr)
+{
+	struct net_device *dev = (struct net_device *) ptr;
+	struct team_port *port;
+
+	port = team_port_get_rtnl(dev);
+	if (!port)
+		return NOTIFY_DONE;
+
+	switch (event) {
+	case NETDEV_UP:
+		if (netif_carrier_ok(dev))
+			team_port_change_check(port, true);
+	case NETDEV_DOWN:
+		team_port_change_check(port, false);
+	case NETDEV_CHANGE:
+		if (netif_running(port->dev))
+			team_port_change_check(port,
+					       !!netif_carrier_ok(port->dev));
+		break;
+	case NETDEV_UNREGISTER:
+		team_del_slave(port->team->dev, dev);
+		break;
+	case NETDEV_FEAT_CHANGE:
+		team_compute_features(port->team);
+		break;
+	case NETDEV_CHANGEMTU:
+		/* Forbid to change mtu of underlaying device */
+		return NOTIFY_BAD;
+	case NETDEV_PRE_TYPE_CHANGE:
+		/* Forbid to change type of underlaying device */
+		return NOTIFY_BAD;
+	}
+	return NOTIFY_DONE;
+}
+
+static struct notifier_block team_notifier_block __read_mostly = {
+	.notifier_call = team_device_event,
+};
+
+
+/***********************
+ * Module init and exit
+ ***********************/
+
+static int __init team_module_init(void)
+{
+	int err;
+
+	register_netdevice_notifier(&team_notifier_block);
+
+	err = rtnl_link_register(&team_link_ops);
+	if (err)
+		goto err_rtnl_reg;
+
+	err = team_nl_init();
+	if (err)
+		goto err_nl_init;
+
+	return 0;
+
+err_nl_init:
+	rtnl_link_unregister(&team_link_ops);
+
+err_rtnl_reg:
+	unregister_netdevice_notifier(&team_notifier_block);
+
+	return err;
+}
+
+static void __exit team_module_exit(void)
+{
+	team_nl_fini();
+	rtnl_link_unregister(&team_link_ops);
+	unregister_netdevice_notifier(&team_notifier_block);
+}
+
+module_init(team_module_init);
+module_exit(team_module_exit);
+
+MODULE_LICENSE("GPL v2");
+MODULE_AUTHOR("Jiri Pirko <jpirko@redhat.com>");
+MODULE_DESCRIPTION("Ethernet team device driver");
+MODULE_ALIAS_RTNL_LINK(DRV_NAME);
diff --git a/drivers/net/team/team_mode_activebackup.c b/drivers/net/team/team_mode_activebackup.c
new file mode 100644
index 0000000..6fe920c
--- /dev/null
+++ b/drivers/net/team/team_mode_activebackup.c
@@ -0,0 +1,137 @@
+/*
+ * net/drivers/team/team_mode_activebackup.c - Active-backup mode for team
+ * Copyright (c) 2011 Jiri Pirko <jpirko@redhat.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#include <linux/kernel.h>
+#include <linux/types.h>
+#include <linux/module.h>
+#include <linux/init.h>
+#include <linux/errno.h>
+#include <linux/netdevice.h>
+#include <net/rtnetlink.h>
+#include <linux/if_team.h>
+
+struct ab_priv {
+	struct team_port __rcu *active_port;
+};
+
+static struct ab_priv *ab_priv(struct team *team)
+{
+	return (struct ab_priv *) &team->mode_priv;
+}
+
+static rx_handler_result_t ab_receive(struct team *team, struct team_port *port,
+				      struct sk_buff *skb) {
+	struct team_port *active_port;
+
+	active_port = rcu_dereference(ab_priv(team)->active_port);
+	if (active_port != port)
+		return RX_HANDLER_EXACT;
+	return RX_HANDLER_ANOTHER;
+}
+
+static bool ab_transmit(struct team *team, struct sk_buff *skb)
+{
+	struct team_port *active_port;
+
+	active_port = rcu_dereference(ab_priv(team)->active_port);
+	if (unlikely(!active_port))
+		goto drop;
+	skb->dev = active_port->dev;
+	if (dev_queue_xmit(skb))
+		return false;
+	return true;
+
+drop:
+	dev_kfree_skb_any(skb);
+	return false;
+}
+
+static void ab_port_leave(struct team *team, struct team_port *port)
+{
+	if (ab_priv(team)->active_port == port)
+		rcu_assign_pointer(ab_priv(team)->active_port, NULL);
+}
+
+static int ab_active_port_get(struct team *team, void *arg)
+{
+	u32 *ifindex = arg;
+
+	*ifindex = 0;
+	if (ab_priv(team)->active_port)
+		*ifindex = ab_priv(team)->active_port->dev->ifindex;
+	return 0;
+}
+
+static int ab_active_port_set(struct team *team, void *arg)
+{
+	u32 *ifindex = arg;
+	struct team_port *port;
+
+	list_for_each_entry_rcu(port, &team->port_list, list) {
+		if (port->dev->ifindex == *ifindex) {
+			rcu_assign_pointer(ab_priv(team)->active_port, port);
+			return 0;
+		}
+	}
+	return -ENOENT;
+}
+
+static struct team_option ab_options[] = {
+	{
+		.name = "activeport",
+		.type = TEAM_OPTION_TYPE_U32,
+		.getter = ab_active_port_get,
+		.setter = ab_active_port_set,
+	},
+};
+
+int ab_init(struct team *team)
+{
+	team_options_register(team, ab_options, ARRAY_SIZE(ab_options));
+	return 0;
+}
+
+void ab_exit(struct team *team)
+{
+	team_options_unregister(team, ab_options, ARRAY_SIZE(ab_options));
+}
+
+static const struct team_mode_ops ab_mode_ops = {
+	.init			= ab_init,
+	.exit			= ab_exit,
+	.receive		= ab_receive,
+	.transmit		= ab_transmit,
+	.port_leave		= ab_port_leave,
+};
+
+static struct team_mode ab_mode = {
+	.kind		= "activebackup",
+	.owner		= THIS_MODULE,
+	.priv_size	= sizeof(struct ab_priv),
+	.ops		= &ab_mode_ops,
+};
+
+static int __init ab_init_module(void)
+{
+	return team_mode_register(&ab_mode);
+}
+
+static void __exit ab_cleanup_module(void)
+{
+	team_mode_unregister(&ab_mode);
+}
+
+module_init(ab_init_module);
+module_exit(ab_cleanup_module);
+
+MODULE_LICENSE("GPL v2");
+MODULE_AUTHOR("Jiri Pirko <jpirko@redhat.com>");
+MODULE_DESCRIPTION("Active-backup mode for team");
+MODULE_ALIAS("team-mode-activebackup");
diff --git a/drivers/net/team/team_mode_roundrobin.c b/drivers/net/team/team_mode_roundrobin.c
new file mode 100644
index 0000000..a0e8f80
--- /dev/null
+++ b/drivers/net/team/team_mode_roundrobin.c
@@ -0,0 +1,107 @@
+/*
+ * net/drivers/team/team_mode_roundrobin.c - Round-robin mode for team
+ * Copyright (c) 2011 Jiri Pirko <jpirko@redhat.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#include <linux/kernel.h>
+#include <linux/types.h>
+#include <linux/module.h>
+#include <linux/init.h>
+#include <linux/errno.h>
+#include <linux/netdevice.h>
+#include <linux/if_team.h>
+
+struct rr_priv {
+	unsigned int sent_packets;
+};
+
+static struct rr_priv *rr_priv(struct team *team)
+{
+	return (struct rr_priv *) &team->mode_priv;
+}
+
+static struct team_port *__get_first_port_up(struct team *team,
+					     struct team_port *port)
+{
+	struct team_port *cur;
+
+	if (port->linkup)
+		return port;
+	cur = port;
+	list_for_each_entry_continue_rcu(cur, &team->port_list, list)
+		if (cur->linkup)
+			return cur;
+	list_for_each_entry_rcu(cur, &team->port_list, list) {
+		if (cur == port)
+			break;
+		if (cur->linkup)
+			return cur;
+	}
+	return NULL;
+}
+
+static bool rr_transmit(struct team *team, struct sk_buff *skb)
+{
+	struct team_port *port;
+	int port_index;
+
+	port_index = rr_priv(team)->sent_packets++ % team->port_count;
+	port = team_get_port_by_index_rcu(team, port_index);
+	port = __get_first_port_up(team, port);
+	if (unlikely(!port))
+		goto drop;
+	skb->dev = port->dev;
+	if (dev_queue_xmit(skb))
+		return false;
+	return true;
+
+drop:
+	dev_kfree_skb_any(skb);
+	return false;
+}
+
+static int rr_port_enter(struct team *team, struct team_port *port)
+{
+	return team_port_set_team_mac(port);
+}
+
+static void rr_port_change_mac(struct team *team, struct team_port *port)
+{
+	team_port_set_team_mac(port);
+}
+
+static const struct team_mode_ops rr_mode_ops = {
+	.transmit		= rr_transmit,
+	.port_enter		= rr_port_enter,
+	.port_change_mac	= rr_port_change_mac,
+};
+
+static struct team_mode rr_mode = {
+	.kind		= "roundrobin",
+	.owner		= THIS_MODULE,
+	.priv_size	= sizeof(struct rr_priv),
+	.ops		= &rr_mode_ops,
+};
+
+static int __init rr_init_module(void)
+{
+	return team_mode_register(&rr_mode);
+}
+
+static void __exit rr_cleanup_module(void)
+{
+	team_mode_unregister(&rr_mode);
+}
+
+module_init(rr_init_module);
+module_exit(rr_cleanup_module);
+
+MODULE_LICENSE("GPL v2");
+MODULE_AUTHOR("Jiri Pirko <jpirko@redhat.com>");
+MODULE_DESCRIPTION("Round-robin mode for team");
+MODULE_ALIAS("team-mode-roundrobin");
diff --git a/include/linux/Kbuild b/include/linux/Kbuild
index 619b565..0b091b3 100644
--- a/include/linux/Kbuild
+++ b/include/linux/Kbuild
@@ -185,6 +185,7 @@ header-y += if_pppol2tp.h
 header-y += if_pppox.h
 header-y += if_slip.h
 header-y += if_strip.h
+header-y += if_team.h
 header-y += if_tr.h
 header-y += if_tun.h
 header-y += if_tunnel.h
diff --git a/include/linux/if.h b/include/linux/if.h
index db20bd4..06b6ef6 100644
--- a/include/linux/if.h
+++ b/include/linux/if.h
@@ -79,6 +79,7 @@
 #define IFF_TX_SKB_SHARING	0x10000	/* The interface supports sharing
 					 * skbs on transmit */
 #define IFF_UNICAST_FLT	0x20000		/* Supports unicast filtering	*/
+#define IFF_TEAM_PORT	0x40000		/* device used as team port */
 
 #define IF_GET_IFACE	0x0001		/* for querying only */
 #define IF_GET_PROTO	0x0002
diff --git a/include/linux/if_team.h b/include/linux/if_team.h
new file mode 100644
index 0000000..14f6388
--- /dev/null
+++ b/include/linux/if_team.h
@@ -0,0 +1,242 @@
+/*
+ * include/linux/if_team.h - Network team device driver header
+ * Copyright (c) 2011 Jiri Pirko <jpirko@redhat.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#ifndef _LINUX_IF_TEAM_H_
+#define _LINUX_IF_TEAM_H_
+
+#ifdef __KERNEL__
+
+struct team_pcpu_stats {
+	u64			rx_packets;
+	u64			rx_bytes;
+	u64			rx_multicast;
+	u64			tx_packets;
+	u64			tx_bytes;
+	struct u64_stats_sync	syncp;
+	u32			rx_dropped;
+	u32			tx_dropped;
+};
+
+struct team;
+
+struct team_port {
+	struct net_device *dev;
+	struct hlist_node hlist; /* node in hash list */
+	struct list_head list; /* node in ordinary list */
+	struct team *team;
+	int index;
+
+	/*
+	 * A place for storing original values of the device before it
+	 * become a port.
+	 */
+	struct {
+		unsigned char dev_addr[MAX_ADDR_LEN];
+		unsigned int mtu;
+	} orig;
+
+	bool linkup;
+	u32 speed;
+	u8 duplex;
+
+	struct rcu_head rcu;
+};
+
+struct team_mode_ops {
+	int (*init)(struct team *team);
+	void (*exit)(struct team *team);
+	rx_handler_result_t (*receive)(struct team *team,
+				       struct team_port *port,
+				       struct sk_buff *skb);
+	bool (*transmit)(struct team *team, struct sk_buff *skb);
+	int (*port_enter)(struct team *team, struct team_port *port);
+	void (*port_leave)(struct team *team, struct team_port *port);
+	void (*port_change_mac)(struct team *team, struct team_port *port);
+};
+
+enum team_option_type {
+	TEAM_OPTION_TYPE_U32,
+	TEAM_OPTION_TYPE_STRING,
+};
+
+struct team_option {
+	struct list_head list;
+	const char *name;
+	enum team_option_type type;
+	int (*getter)(struct team *team, void *arg);
+	int (*setter)(struct team *team, void *arg);
+};
+
+struct team_mode {
+	struct list_head list;
+	const char *kind;
+	struct module *owner;
+	size_t priv_size;
+	const struct team_mode_ops *ops;
+};
+
+#define TEAM_PORT_HASHBITS 4
+#define TEAM_PORT_HASHENTRIES (1 << TEAM_PORT_HASHBITS)
+
+#define TEAM_MODE_PRIV_LONGS 4
+#define TEAM_MODE_PRIV_SIZE (sizeof(long) * TEAM_MODE_PRIV_LONGS)
+
+struct team {
+	struct net_device *dev; /* associated netdevice */
+	struct team_pcpu_stats __percpu *pcpu_stats;
+
+	spinlock_t lock; /* used for overall locking, e.g. port lists write */
+
+	/*
+	 * port lists with port count
+	 */
+	int port_count;
+	struct hlist_head port_hlist[TEAM_PORT_HASHENTRIES];
+	struct list_head port_list;
+
+	struct list_head option_list;
+
+	const struct team_mode *mode;
+	struct team_mode_ops ops;
+	long mode_priv[TEAM_MODE_PRIV_LONGS];
+};
+
+static inline struct hlist_head *team_port_index_hash(struct team *team,
+						      int port_index)
+{
+	return &team->port_hlist[port_index & (TEAM_PORT_HASHENTRIES - 1)];
+}
+
+static inline struct team_port *team_get_port_by_index(struct team *team,
+						       int port_index)
+{
+	struct hlist_node *p;
+	struct team_port *port;
+	struct hlist_head *head = team_port_index_hash(team, port_index);
+
+	hlist_for_each_entry(port, p, head, hlist)
+		if (port->index == port_index)
+			return port;
+	return NULL;
+}
+static inline struct team_port *team_get_port_by_index_rcu(struct team *team,
+							   int port_index)
+{
+	struct hlist_node *p;
+	struct team_port *port;
+	struct hlist_head *head = team_port_index_hash(team, port_index);
+
+	hlist_for_each_entry_rcu(port, p, head, hlist)
+		if (port->index == port_index)
+			return port;
+	return NULL;
+}
+
+extern int team_port_set_team_mac(struct team_port *port);
+extern void team_options_register(struct team *team,
+				  struct team_option *option,
+				  size_t option_count);
+extern void team_options_unregister(struct team *team,
+				    struct team_option *option,
+				    size_t option_count);
+extern int team_mode_register(struct team_mode *mode);
+extern int team_mode_unregister(struct team_mode *mode);
+
+#endif /* __KERNEL__ */
+
+#define TEAM_STRING_MAX_LEN 32
+
+/**********************************
+ * NETLINK_GENERIC netlink family.
+ **********************************/
+
+enum {
+	TEAM_CMD_NOOP,
+	TEAM_CMD_OPTIONS_SET,
+	TEAM_CMD_OPTIONS_GET,
+	TEAM_CMD_PORT_LIST_GET,
+
+	__TEAM_CMD_MAX,
+	TEAM_CMD_MAX = (__TEAM_CMD_MAX - 1),
+};
+
+enum {
+	TEAM_ATTR_UNSPEC,
+	TEAM_ATTR_TEAM_IFINDEX,		/* u32 */
+	TEAM_ATTR_LIST_OPTION,		/* nest */
+	TEAM_ATTR_LIST_PORT,		/* nest */
+
+	__TEAM_ATTR_MAX,
+	TEAM_ATTR_MAX = __TEAM_ATTR_MAX - 1,
+};
+
+/* Nested layout of get/set msg:
+ *
+ *	[TEAM_ATTR_LIST_OPTION]
+ *		[TEAM_ATTR_ITEM_OPTION]
+ *			[TEAM_ATTR_OPTION_*], ...
+ *		[TEAM_ATTR_ITEM_OPTION]
+ *			[TEAM_ATTR_OPTION_*], ...
+ *		...
+ *	[TEAM_ATTR_LIST_PORT]
+ *		[TEAM_ATTR_ITEM_PORT]
+ *			[TEAM_ATTR_PORT_*], ...
+ *		[TEAM_ATTR_ITEM_PORT]
+ *			[TEAM_ATTR_PORT_*], ...
+ *		...
+ */
+
+enum {
+	TEAM_ATTR_ITEM_OPTION_UNSPEC,
+	TEAM_ATTR_ITEM_OPTION,		/* nest */
+
+	__TEAM_ATTR_ITEM_OPTION_MAX,
+	TEAM_ATTR_ITEM_OPTION_MAX = __TEAM_ATTR_ITEM_OPTION_MAX - 1,
+};
+
+enum {
+	TEAM_ATTR_OPTION_UNSPEC,
+	TEAM_ATTR_OPTION_NAME,		/* string */
+	TEAM_ATTR_OPTION_CHANGED,	/* flag */
+	TEAM_ATTR_OPTION_TYPE,		/* u8 */
+	TEAM_ATTR_OPTION_DATA,		/* dynamic */
+
+	__TEAM_ATTR_OPTION_MAX,
+	TEAM_ATTR_OPTION_MAX = __TEAM_ATTR_OPTION_MAX - 1,
+};
+
+enum {
+	TEAM_ATTR_ITEM_PORT_UNSPEC,
+	TEAM_ATTR_ITEM_PORT,		/* nest */
+
+	__TEAM_ATTR_ITEM_PORT_MAX,
+	TEAM_ATTR_ITEM_PORT_MAX = __TEAM_ATTR_ITEM_PORT_MAX - 1,
+};
+
+enum {
+	TEAM_ATTR_PORT_UNSPEC,
+	TEAM_ATTR_PORT_IFINDEX,		/* u32 */
+	TEAM_ATTR_PORT_CHANGED,		/* flag */
+	TEAM_ATTR_PORT_LINKUP,		/* flag */
+	TEAM_ATTR_PORT_SPEED,		/* u32 */
+	TEAM_ATTR_PORT_DUPLEX,		/* u8 */
+
+	__TEAM_ATTR_PORT_MAX,
+	TEAM_ATTR_PORT_MAX = __TEAM_ATTR_PORT_MAX - 1,
+};
+
+/*
+ * NETLINK_GENERIC related info
+ */
+#define TEAM_GENL_NAME "team"
+#define TEAM_GENL_VERSION 0x1
+#define TEAM_GENL_CHANGE_EVENT_MC_GRP_NAME "change_event"
+
+#endif /* _LINUX_IF_TEAM_H_ */
-- 
1.7.6

^ permalink raw reply related

* Re: [RFC] [ver3 PATCH 0/6] Implement multiqueue virtio-net
From: Sasha Levin @ 2011-11-12  7:20 UTC (permalink / raw)
  To: Krishna Kumar; +Cc: kvm, mst, netdev, rusty, virtualization, davem
In-Reply-To: <20111112054546.9555.7764.sendpatchset@krkumar2.in.ibm.com>

On Sat, 2011-11-12 at 11:15 +0530, Krishna Kumar wrote:
> Sasha Levin <levinsasha928@gmail.com> wrote on 11/12/2011 03:32:04 AM:
> 
> > I'm seeing this BUG() sometimes when running it using a small patch I
> > did for KVM tool:
> > 
> > [    1.281531] Call Trace:
> > [    1.281531]  [<ffffffff8138a0e5>] ? free_rq_sq+0x2c/0xce
> > [    1.281531]  [<ffffffff8138bb63>] ? virtnet_probe+0x81c/0x855
> > [    1.281531]  [<ffffffff8129c9e7>] ? virtio_dev_probe+0xa7/0xc6
> > [    1.281531]  [<ffffffff8134d2c3>] ? driver_probe_device+0xb2/0x142
> > [    1.281531]  [<ffffffff8134d3a2>] ? __driver_attach+0x4f/0x6f
> > [    1.281531]  [<ffffffff8134d353>] ? driver_probe_device+0x142/0x142
> > [    1.281531]  [<ffffffff8134c3ab>] ? bus_for_each_dev+0x47/0x72
> > [    1.281531]  [<ffffffff8134c90d>] ? bus_add_driver+0xa2/0x1e6
> > [    1.281531]  [<ffffffff81cc1b36>] ? tun_init+0x89/0x89
> > [    1.281531]  [<ffffffff8134db59>] ? driver_register+0x8d/0xf8
> > [    1.281531]  [<ffffffff81cc1b36>] ? tun_init+0x89/0x89
> > [    1.281531]  [<ffffffff81c98ac1>] ? do_one_initcall+0x78/0x130
> > [    1.281531]  [<ffffffff81c98c0e>] ? kernel_init+0x95/0x113
> > [    1.281531]  [<ffffffff81658274>] ? kernel_thread_helper+0x4/0x10
> > [    1.281531]  [<ffffffff81c98b79>] ? do_one_initcall+0x130/0x130
> > [    1.281531]  [<ffffffff81658270>] ? gs_change+0x13/0x13
> > [    1.281531] Code: c2 85 d2 48 0f 45 2d d1 39 ce 00 eb 22 65 8b 14 25
> > 90 cc 00 00 48 8b 05 f0 a6 bc 00 48 63 d2 4c 89 e7 48 03 3c d0 e8 83 dd
> > 00 00 
> > [    1.281531]  8b 68 10 44 89 e6 48 89 ef 2b 75 18 e8 e4 f1 ff ff 8b 05
> > fd 
> > [    1.281531] RIP  [<ffffffff810b3ac7>] free_percpu+0x9a/0x104
> > [    1.281531]  RSP <ffff88001383fd50>
> > [    1.281531] CR2: 0000000000000010
> > [    1.281531] ---[ end trace 68cbc23dfe2fe62a ]---
> > 
> > I don't have time today to dig into it, sorry.
> 
> Thanks for the report.
> 
> free_rq_sq() was being called twice in the failure path. The second
> call panic'd since it had freed the same pointers earlier.
> 
> 1. free_rq_sq() was being called twice in the failure path.
>    virtnet_setup_vqs() had already freed up rq/sq on error, and
>    virtnet_probe() tried to do it again. Fix it in virtnet_probe
>    by moving the call up.
> 2. Make free_rq_sq() re-entrant by setting freed pointers to NULL.
> 3. Remove free_stats() as it was being called only once.
> 
> Sasha, could you please try this patch on top of existing patches?
> 
> thanks!
> 
> Signed-off-by: krkumar2@in.ibm.com
> ---
[snip]

I've tested it and it looks good, no BUG() or panic.

I would suggest adding some output if we take the failure path, since
while the guest does boot peacefully, it doesn't have a network
interface and there's nothing in the boot log to indicate anything has
gone wrong.

-- 

Sasha.

^ permalink raw reply

* Re: [PATCH net-next 09/13] bnx2x: add pri_map module parameter
From: David Miller @ 2011-11-12  5:50 UTC (permalink / raw)
  To: dmitry; +Cc: netdev, ariele, eilong
In-Reply-To: <1320938054-31288-10-git-send-email-dmitry@broadcom.com>

From: "Dmitry Kravkov" <dmitry@broadcom.com>
Date: Thu, 10 Nov 2011 17:14:10 +0200

> From: Ariel Elior <ariele@broadcom.com>
> 
> The optional parameter pri_map is used to map the
> skb->priority to a Class Of Service (CoS) in the HW.
> This 32 bit parameter is evaluated by the  driver as 8
> values of 4 bits each. Each nibble sets the desired
> HW queue number for that priority.
> 
> Also:
> on the 5771x family this feature is unavailable (a single COS services all).
> on the 57712 family two classes of service are available.
> on the 578xx family three classes of service are availabe.
> configuring  priorities to unavailable COSs will log an error and default to
> COS 0.
> 
> Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
> Signed-off-by: Eilon Greenstein <eilong@broadcom.com>

You can't do driver specific things like this.

Make a generic facility that any driver can make use of, not just
your's.

^ permalink raw reply

* Re: [patch net-next V7] net: introduce ethernet teaming device
From: David Miller @ 2011-11-12  5:45 UTC (permalink / raw)
  To: jpirko
  Cc: fbl, netdev, eric.dumazet, bhutchings, shemminger, fubar, andy,
	tgraf, ebiederm, mirqus, kaber, greearb, jesse, benjamin.poirier,
	jzupka, ivecera
In-Reply-To: <20111112001459.GA2448@minipsycho.orion>

From: Jiri Pirko <jpirko@redhat.com>
Date: Sat, 12 Nov 2011 01:15:00 +0100

> Yes, this should be checked. I missed that. I would like to address this
> as follow up patch because I'm sure it's not the last bug in team (I
> would be sending Vx of the patch for long time :( )

No, Jiri.  Known bugs with obvious fixes get fixed first.

You see it as an endless process, but rather the code just isn't ready
yet if people can still find easy to diagnose and fix flaws so
quickly.

^ permalink raw reply

* Re: [RFC] [ver3 PATCH 0/6] Implement multiqueue virtio-net
From: Krishna Kumar @ 2011-11-12  5:45 UTC (permalink / raw)
  To: levinsasha928
  Cc: kvm, mst, netdev, rusty, virtualization, davem, Krishna Kumar

Sasha Levin <levinsasha928@gmail.com> wrote on 11/12/2011 03:32:04 AM:

> I'm seeing this BUG() sometimes when running it using a small patch I
> did for KVM tool:
> 
> [    1.281531] Call Trace:
> [    1.281531]  [<ffffffff8138a0e5>] ? free_rq_sq+0x2c/0xce
> [    1.281531]  [<ffffffff8138bb63>] ? virtnet_probe+0x81c/0x855
> [    1.281531]  [<ffffffff8129c9e7>] ? virtio_dev_probe+0xa7/0xc6
> [    1.281531]  [<ffffffff8134d2c3>] ? driver_probe_device+0xb2/0x142
> [    1.281531]  [<ffffffff8134d3a2>] ? __driver_attach+0x4f/0x6f
> [    1.281531]  [<ffffffff8134d353>] ? driver_probe_device+0x142/0x142
> [    1.281531]  [<ffffffff8134c3ab>] ? bus_for_each_dev+0x47/0x72
> [    1.281531]  [<ffffffff8134c90d>] ? bus_add_driver+0xa2/0x1e6
> [    1.281531]  [<ffffffff81cc1b36>] ? tun_init+0x89/0x89
> [    1.281531]  [<ffffffff8134db59>] ? driver_register+0x8d/0xf8
> [    1.281531]  [<ffffffff81cc1b36>] ? tun_init+0x89/0x89
> [    1.281531]  [<ffffffff81c98ac1>] ? do_one_initcall+0x78/0x130
> [    1.281531]  [<ffffffff81c98c0e>] ? kernel_init+0x95/0x113
> [    1.281531]  [<ffffffff81658274>] ? kernel_thread_helper+0x4/0x10
> [    1.281531]  [<ffffffff81c98b79>] ? do_one_initcall+0x130/0x130
> [    1.281531]  [<ffffffff81658270>] ? gs_change+0x13/0x13
> [    1.281531] Code: c2 85 d2 48 0f 45 2d d1 39 ce 00 eb 22 65 8b 14 25
> 90 cc 00 00 48 8b 05 f0 a6 bc 00 48 63 d2 4c 89 e7 48 03 3c d0 e8 83 dd
> 00 00 
> [    1.281531]  8b 68 10 44 89 e6 48 89 ef 2b 75 18 e8 e4 f1 ff ff 8b 05
> fd 
> [    1.281531] RIP  [<ffffffff810b3ac7>] free_percpu+0x9a/0x104
> [    1.281531]  RSP <ffff88001383fd50>
> [    1.281531] CR2: 0000000000000010
> [    1.281531] ---[ end trace 68cbc23dfe2fe62a ]---
> 
> I don't have time today to dig into it, sorry.

Thanks for the report.

free_rq_sq() was being called twice in the failure path. The second
call panic'd since it had freed the same pointers earlier.

1. free_rq_sq() was being called twice in the failure path.
   virtnet_setup_vqs() had already freed up rq/sq on error, and
   virtnet_probe() tried to do it again. Fix it in virtnet_probe
   by moving the call up.
2. Make free_rq_sq() re-entrant by setting freed pointers to NULL.
3. Remove free_stats() as it was being called only once.

Sasha, could you please try this patch on top of existing patches?

thanks!

Signed-off-by: krkumar2@in.ibm.com
---
 drivers/net/virtio_net.c |   41 +++++++++++--------------------------
 1 file changed, 13 insertions(+), 28 deletions(-)

diff -ruNp n6/drivers/net/virtio_net.c n7/drivers/net/virtio_net.c
--- n6/drivers/net/virtio_net.c	2011-11-12 11:03:48.000000000 +0530
+++ n7/drivers/net/virtio_net.c	2011-11-12 10:39:28.000000000 +0530
@@ -782,23 +782,6 @@ static void virtnet_netpoll(struct net_d
 }
 #endif
 
-static void free_stats(struct virtnet_info *vi)
-{
-	int i;
-
-	for (i = 0; i < vi->num_queue_pairs; i++) {
-		if (vi->sq && vi->sq[i]) {
-			free_percpu(vi->sq[i]->stats);
-			vi->sq[i]->stats = NULL;
-		}
-
-		if (vi->rq && vi->rq[i]) {
-			free_percpu(vi->rq[i]->stats);
-			vi->rq[i]->stats = NULL;
-		}
-	}
-}
-
 static int virtnet_open(struct net_device *dev)
 {
 	struct virtnet_info *vi = netdev_priv(dev);
@@ -1054,19 +1037,22 @@ static void free_rq_sq(struct virtnet_in
 {
 	int i;
 
-	free_stats(vi);
-
-	if (vi->rq) {
-		for (i = 0; i < vi->num_queue_pairs; i++)
+	for (i = 0; i < vi->num_queue_pairs; i++) {
+		if (vi->rq && vi->rq[i]) {
+			free_percpu(vi->rq[i]->stats);
 			kfree(vi->rq[i]);
-		kfree(vi->rq);
-	}
+			vi->rq[i] = NULL;
+		}
 
-	if (vi->sq) {
-		for (i = 0; i < vi->num_queue_pairs; i++)
+		if (vi->sq && vi->sq[i]) {
+			free_percpu(vi->sq[i]->stats);
 			kfree(vi->sq[i]);
-		kfree(vi->sq);
+			vi->sq[i] = NULL;
+		}
 	}
+
+	kfree(vi->rq);
+	kfree(vi->sq);
 }
 
 static void free_unused_bufs(struct virtnet_info *vi)
@@ -1387,10 +1373,9 @@ free_vqs:
 	for (i = 0; i < num_queue_pairs; i++)
 		cancel_delayed_work_sync(&vi->rq[i]->refill);
 	vdev->config->del_vqs(vdev);
-
-free_netdev:
 	free_rq_sq(vi);
 
+free_netdev:
 	free_netdev(dev);
 	return err;
 }


^ permalink raw reply

* Re: [PATCH] r8169: add module param for control of ASPM  disable
From: Matthew Garrett @ 2011-11-12  4:59 UTC (permalink / raw)
  To: Todd Broch; +Cc: Realtek linux nic maintainers, Francois Romieu, netdev
In-Reply-To: <1321052755-15368-1-git-send-email-tbroch@chromium.org>

On Fri, Nov 11, 2011 at 03:05:55PM -0800, Todd Broch wrote:
> ASPM has been disabled in this driver by default as its been
> implicated in stability issues on at least one platform.  This CL adds
> a  module parameter to allow control of ASPM disable.  The default
> value is to enable ASPM again as its provides signficant (200mW) power
> savings on the platform I tested.

The default is then to break some hardware. That's not reasonable. 
Instead, we need to identify which chip versions are broken and limit 
the blacklist to them.

-- 
Matthew Garrett | mjg59@srcf.ucam.org

^ permalink raw reply

* [PATCH FIX net v2] forcedeth: fix stats on hardware without extended stats support
From: David Decotigny @ 2011-11-12  2:27 UTC (permalink / raw)
  To: netdev, linux-kernel
  Cc: David S. Miller, David Decotigny, Ian Campbell, Eric Dumazet,
	Jeff Kirsher, Jiri Pirko, Joe Perches, Szymon Janc, Richard Jones,
	Ayaz Abdulla
In-Reply-To: <cover.1321064621.git.david.decotigny@google.com>

This change makes sure that tx_packets/rx_bytes ifconfig counters are
updated even on NICs that don't provide hardware support for these
stats: they are now updated in software. For the sake of consistency,
we also now have tx_bytes updated in software (hardware counters
include ethernet CRC, and software doesn't account for it).

This reverts parts of:
 - "forcedeth: statistics optimization" (21828163b2)
 - "forcedeth: Improve stats counters" (0bdfea8ba8)
 - "forcedeth: remove unneeded stats updates" (4687f3f364)

Tested:
  pktgen + loopback (http://patchwork.ozlabs.org/patch/124698/)
  reports identical tx_packets/rx_packets and tx_bytes/rx_bytes.



Signed-off-by: David Decotigny <david.decotigny@google.com>
---
 drivers/net/ethernet/nvidia/forcedeth.c |   36 +++++++++++++++++++++++-------
 1 files changed, 27 insertions(+), 9 deletions(-)

diff --git a/drivers/net/ethernet/nvidia/forcedeth.c b/drivers/net/ethernet/nvidia/forcedeth.c
index 1dca570..1c61d36 100644
--- a/drivers/net/ethernet/nvidia/forcedeth.c
+++ b/drivers/net/ethernet/nvidia/forcedeth.c
@@ -609,7 +609,7 @@ struct nv_ethtool_str {
 };
 
 static const struct nv_ethtool_str nv_estats_str[] = {
-	{ "tx_bytes" },
+	{ "tx_bytes" }, /* includes Ethernet FCS CRC */
 	{ "tx_zero_rexmt" },
 	{ "tx_one_rexmt" },
 	{ "tx_many_rexmt" },
@@ -637,7 +637,7 @@ static const struct nv_ethtool_str nv_estats_str[] = {
 	/* version 2 stats */
 	{ "tx_deferral" },
 	{ "tx_packets" },
-	{ "rx_bytes" },
+	{ "rx_bytes" }, /* includes Ethernet FCS CRC */
 	{ "tx_pause" },
 	{ "rx_pause" },
 	{ "rx_drop_frame" },
@@ -649,7 +649,7 @@ static const struct nv_ethtool_str nv_estats_str[] = {
 };
 
 struct nv_ethtool_stats {
-	u64 tx_bytes;
+	u64 tx_bytes; /* should be ifconfig->tx_bytes + 4*tx_packets */
 	u64 tx_zero_rexmt;
 	u64 tx_one_rexmt;
 	u64 tx_many_rexmt;
@@ -670,14 +670,14 @@ struct nv_ethtool_stats {
 	u64 rx_unicast;
 	u64 rx_multicast;
 	u64 rx_broadcast;
-	u64 rx_packets;
+	u64 rx_packets; /* should be ifconfig->rx_packets */
 	u64 rx_errors_total;
 	u64 tx_errors_total;
 
 	/* version 2 stats */
 	u64 tx_deferral;
-	u64 tx_packets;
-	u64 rx_bytes;
+	u64 tx_packets; /* should be ifconfig->tx_packets */
+	u64 rx_bytes;   /* should be ifconfig->rx_bytes + 4*rx_packets */
 	u64 tx_pause;
 	u64 rx_pause;
 	u64 rx_drop_frame;
@@ -1706,10 +1706,17 @@ static struct net_device_stats *nv_get_stats(struct net_device *dev)
 	if (np->driver_data & (DEV_HAS_STATISTICS_V1|DEV_HAS_STATISTICS_V2|DEV_HAS_STATISTICS_V3)) {
 		nv_get_hw_stats(dev);
 
+		/*
+		 * Note: because HW stats are not always available and
+		 * for consistency reasons, the following ifconfig
+		 * stats are managed by software: rx_bytes, tx_bytes,
+		 * rx_packets and tx_packets. The related hardware
+		 * stats reported by ethtool should be equivalent to
+		 * these ifconfig stats, with 4 additional bytes per
+		 * packet (Ethernet FCS CRC).
+		 */
+
 		/* copy to net_device stats */
-		dev->stats.tx_packets = np->estats.tx_packets;
-		dev->stats.rx_bytes = np->estats.rx_bytes;
-		dev->stats.tx_bytes = np->estats.tx_bytes;
 		dev->stats.tx_fifo_errors = np->estats.tx_fifo_errors;
 		dev->stats.tx_carrier_errors = np->estats.tx_carrier_errors;
 		dev->stats.rx_crc_errors = np->estats.rx_crc_errors;
@@ -2380,6 +2387,9 @@ static int nv_tx_done(struct net_device *dev, int limit)
 				if (flags & NV_TX_ERROR) {
 					if ((flags & NV_TX_RETRYERROR) && !(flags & NV_TX_RETRYCOUNT_MASK))
 						nv_legacybackoff_reseed(dev);
+				} else {
+					dev->stats.tx_packets++;
+					dev->stats.tx_bytes += np->get_tx_ctx->skb->len;
 				}
 				dev_kfree_skb_any(np->get_tx_ctx->skb);
 				np->get_tx_ctx->skb = NULL;
@@ -2390,6 +2400,9 @@ static int nv_tx_done(struct net_device *dev, int limit)
 				if (flags & NV_TX2_ERROR) {
 					if ((flags & NV_TX2_RETRYERROR) && !(flags & NV_TX2_RETRYCOUNT_MASK))
 						nv_legacybackoff_reseed(dev);
+				} else {
+					dev->stats.tx_packets++;
+					dev->stats.tx_bytes += np->get_tx_ctx->skb->len;
 				}
 				dev_kfree_skb_any(np->get_tx_ctx->skb);
 				np->get_tx_ctx->skb = NULL;
@@ -2429,6 +2442,9 @@ static int nv_tx_done_optimized(struct net_device *dev, int limit)
 					else
 						nv_legacybackoff_reseed(dev);
 				}
+			} else {
+				dev->stats.tx_packets++;
+				dev->stats.tx_bytes += np->get_tx_ctx->skb->len;
 			}
 
 			dev_kfree_skb_any(np->get_tx_ctx->skb);
@@ -2678,6 +2694,7 @@ static int nv_rx_process(struct net_device *dev, int limit)
 		skb->protocol = eth_type_trans(skb, dev);
 		napi_gro_receive(&np->napi, skb);
 		dev->stats.rx_packets++;
+		dev->stats.rx_bytes += len;
 next_pkt:
 		if (unlikely(np->get_rx.orig++ == np->last_rx.orig))
 			np->get_rx.orig = np->first_rx.orig;
@@ -2761,6 +2778,7 @@ static int nv_rx_process_optimized(struct net_device *dev, int limit)
 			}
 			napi_gro_receive(&np->napi, skb);
 			dev->stats.rx_packets++;
+			dev->stats.rx_bytes += len;
 		} else {
 			dev_kfree_skb(skb);
 		}
-- 
1.7.3.1

^ permalink raw reply related

* [PATCH net-next v2] net-forcedeth: Add internal loopback support for forcedeth NICs.
From: Sanjay Hortikar @ 2011-11-12  2:11 UTC (permalink / raw)
  To: netdev, linux-kernel
  Cc: David S. Miller, David Decotigny, Ian Campbell, Rick Jones,
	Eric Dumazet, Sanjay Hortikar, Mahesh Bandewar
In-Reply-To: <cover.1321063429.git.horti@google.com>

Support enabling/disabling/querying internal loopback mode for
forcedeth NICs using ethtool.

Signed-off-by: Sanjay Hortikar <horti@google.com>
Signed-off-by: Mahesh Bandewar <maheshb@google.com>
---
 drivers/net/ethernet/nvidia/forcedeth.c |  156 ++++++++++++++++++++++++++++++-
 1 files changed, 155 insertions(+), 1 deletions(-)

diff --git a/drivers/net/ethernet/nvidia/forcedeth.c b/drivers/net/ethernet/nvidia/forcedeth.c
index d24c45b..e8a5ae3 100644
--- a/drivers/net/ethernet/nvidia/forcedeth.c
+++ b/drivers/net/ethernet/nvidia/forcedeth.c
@@ -3003,6 +3003,73 @@ static void nv_update_pause(struct net_device *dev, u32 pause_flags)
 	}
 }
 
+static void nv_force_linkspeed(struct net_device *dev, int speed, int duplex)
+{
+	struct fe_priv *np = netdev_priv(dev);
+	u8 __iomem *base = get_hwbase(dev);
+	u32 phyreg, txreg;
+	int mii_status;
+
+	np->linkspeed = NVREG_LINKSPEED_FORCE|speed;
+	np->duplex = duplex;
+
+	/* see if gigabit phy */
+	mii_status = mii_rw(dev, np->phyaddr, MII_BMSR, MII_READ);
+	if (mii_status & PHY_GIGABIT) {
+		np->gigabit = PHY_GIGABIT;
+		phyreg = readl(base + NvRegSlotTime);
+		phyreg &= ~(0x3FF00);
+		if ((np->linkspeed & 0xFFF) == NVREG_LINKSPEED_10)
+			phyreg |= NVREG_SLOTTIME_10_100_FULL;
+		else if ((np->linkspeed & 0xFFF) == NVREG_LINKSPEED_100)
+			phyreg |= NVREG_SLOTTIME_10_100_FULL;
+		else if ((np->linkspeed & 0xFFF) == NVREG_LINKSPEED_1000)
+			phyreg |= NVREG_SLOTTIME_1000_FULL;
+		writel(phyreg, base + NvRegSlotTime);
+	}
+
+	phyreg = readl(base + NvRegPhyInterface);
+	phyreg &= ~(PHY_HALF|PHY_100|PHY_1000);
+	if (np->duplex == 0)
+		phyreg |= PHY_HALF;
+	if ((np->linkspeed & NVREG_LINKSPEED_MASK) == NVREG_LINKSPEED_100)
+		phyreg |= PHY_100;
+	else if ((np->linkspeed & NVREG_LINKSPEED_MASK) ==
+							NVREG_LINKSPEED_1000)
+		phyreg |= PHY_1000;
+	writel(phyreg, base + NvRegPhyInterface);
+
+	if (phyreg & PHY_RGMII) {
+		if ((np->linkspeed & NVREG_LINKSPEED_MASK) ==
+							NVREG_LINKSPEED_1000)
+			txreg = NVREG_TX_DEFERRAL_RGMII_1000;
+		else
+			txreg = NVREG_TX_DEFERRAL_RGMII_10_100;
+	} else {
+		txreg = NVREG_TX_DEFERRAL_DEFAULT;
+	}
+	writel(txreg, base + NvRegTxDeferral);
+
+	if (np->desc_ver == DESC_VER_1) {
+		txreg = NVREG_TX_WM_DESC1_DEFAULT;
+	} else {
+		if ((np->linkspeed & NVREG_LINKSPEED_MASK) ==
+					 NVREG_LINKSPEED_1000)
+			txreg = NVREG_TX_WM_DESC2_3_1000;
+		else
+			txreg = NVREG_TX_WM_DESC2_3_DEFAULT;
+	}
+	writel(txreg, base + NvRegTxWatermark);
+
+	writel(NVREG_MISC1_FORCE | (np->duplex ? 0 : NVREG_MISC1_HD),
+			base + NvRegMisc1);
+	pci_push(base);
+	writel(np->linkspeed, base + NvRegLinkSpeed);
+	pci_push(base);
+
+	return;
+}
+
 /**
  * nv_update_linkspeed: Setup the MAC according to the link partner
  * @dev: Network device to be configured
@@ -3024,11 +3091,25 @@ static int nv_update_linkspeed(struct net_device *dev)
 	int newls = np->linkspeed;
 	int newdup = np->duplex;
 	int mii_status;
+	u32 bmcr;
 	int retval = 0;
 	u32 control_1000, status_1000, phyreg, pause_flags, txreg;
 	u32 txrxFlags = 0;
 	u32 phy_exp;
 
+	/* If device loopback is enabled, set carrier on and enable max link
+	 * speed.
+	 */
+	bmcr = mii_rw(dev, np->phyaddr, MII_BMCR, MII_READ);
+	if (bmcr & BMCR_LOOPBACK) {
+		if (netif_running(dev)) {
+			nv_force_linkspeed(dev, NVREG_LINKSPEED_1000, 1);
+			if (!netif_carrier_ok(dev))
+				netif_carrier_on(dev);
+		}
+		return 1;
+	}
+
 	/* BMSR_LSTATUS is latched, read it twice:
 	 * we want the current value.
 	 */
@@ -4455,6 +4536,61 @@ static int nv_set_pauseparam(struct net_device *dev, struct ethtool_pauseparam*
 	return 0;
 }
 
+static int nv_set_loopback(struct net_device *dev, u32 features)
+{
+	struct fe_priv *np = netdev_priv(dev);
+	unsigned long flags;
+	u32 miicontrol;
+	int err, retval = 0;
+
+	spin_lock_irqsave(&np->lock, flags);
+	miicontrol = mii_rw(dev, np->phyaddr, MII_BMCR, MII_READ);
+	if (features & NETIF_F_LOOPBACK) {
+		if (miicontrol & BMCR_LOOPBACK) {
+			spin_unlock_irqrestore(&np->lock, flags);
+			netdev_info(dev, "Loopback already enabled\n");
+			return 0;
+		}
+		nv_disable_irq(dev);
+		/* Turn on loopback mode */
+		miicontrol |= BMCR_LOOPBACK | BMCR_FULLDPLX | BMCR_SPEED1000;
+		err = mii_rw(dev, np->phyaddr, MII_BMCR, miicontrol);
+		if (err) {
+			retval = PHY_ERROR;
+			spin_unlock_irqrestore(&np->lock, flags);
+			phy_init(dev);
+		} else {
+			if (netif_running(dev)) {
+				/* Force 1000 Mbps full-duplex */
+				nv_force_linkspeed(dev, NVREG_LINKSPEED_1000,
+									 1);
+				/* Force link up */
+				netif_carrier_on(dev);
+			}
+			spin_unlock_irqrestore(&np->lock, flags);
+			netdev_info(dev,
+				"Internal PHY loopback mode enabled.\n");
+		}
+	} else {
+		if (!(miicontrol & BMCR_LOOPBACK)) {
+			spin_unlock_irqrestore(&np->lock, flags);
+			netdev_info(dev, "Loopback already disabled\n");
+			return 0;
+		}
+		nv_disable_irq(dev);
+		/* Turn off loopback */
+		spin_unlock_irqrestore(&np->lock, flags);
+		netdev_info(dev, "Internal PHY loopback mode disabled.\n");
+		phy_init(dev);
+	}
+	msleep(500);
+	spin_lock_irqsave(&np->lock, flags);
+	nv_enable_irq(dev);
+	spin_unlock_irqrestore(&np->lock, flags);
+
+	return retval;
+}
+
 static u32 nv_fix_features(struct net_device *dev, u32 features)
 {
 	/* vlan is dependent on rx checksum offload */
@@ -4490,6 +4626,13 @@ static int nv_set_features(struct net_device *dev, u32 features)
 	struct fe_priv *np = netdev_priv(dev);
 	u8 __iomem *base = get_hwbase(dev);
 	u32 changed = dev->features ^ features;
+	int retval;
+
+	if ((changed & NETIF_F_LOOPBACK) && netif_running(dev)) {
+		retval = nv_set_loopback(dev, features);
+		if (retval != 0)
+			return retval;
+	}
 
 	if (changed & NETIF_F_RXCSUM) {
 		spin_lock_irq(&np->lock);
@@ -5124,6 +5267,12 @@ static int nv_open(struct net_device *dev)
 
 	spin_unlock_irq(&np->lock);
 
+	/* If the loopback feature was set while the device was down, make sure
+	 * that it's set correctly now.
+	 */
+	if (dev->features & NETIF_F_LOOPBACK)
+		nv_set_loopback(dev, dev->features);
+
 	return 0;
 out_drain:
 	nv_drain_rxtx(dev);
@@ -5328,6 +5477,9 @@ static int __devinit nv_probe(struct pci_dev *pci_dev, const struct pci_device_i
 
 	dev->features |= dev->hw_features;
 
+	/* Add loopback capability to the device. */
+	dev->hw_features |= NETIF_F_LOOPBACK;
+
 	np->pause_flags = NV_PAUSEFRAME_RX_CAPABLE | NV_PAUSEFRAME_RX_REQ | NV_PAUSEFRAME_AUTONEG;
 	if ((id->driver_data & DEV_HAS_PAUSEFRAME_TX_V1) ||
 	    (id->driver_data & DEV_HAS_PAUSEFRAME_TX_V2) ||
@@ -5603,12 +5755,14 @@ static int __devinit nv_probe(struct pci_dev *pci_dev, const struct pci_device_i
 	dev_info(&pci_dev->dev, "ifname %s, PHY OUI 0x%x @ %d, addr %pM\n",
 		 dev->name, np->phy_oui, np->phyaddr, dev->dev_addr);
 
-	dev_info(&pci_dev->dev, "%s%s%s%s%s%s%s%s%s%sdesc-v%u\n",
+	dev_info(&pci_dev->dev, "%s%s%s%s%s%s%s%s%s%s%sdesc-v%u\n",
 		 dev->features & NETIF_F_HIGHDMA ? "highdma " : "",
 		 dev->features & (NETIF_F_IP_CSUM | NETIF_F_SG) ?
 			"csum " : "",
 		 dev->features & (NETIF_F_HW_VLAN_RX | NETIF_F_HW_VLAN_TX) ?
 			"vlan " : "",
+		 dev->features & (NETIF_F_LOOPBACK) ?
+			"loopback " : "",
 		 id->driver_data & DEV_HAS_POWER_CNTRL ? "pwrctl " : "",
 		 id->driver_data & DEV_HAS_MGMT_UNIT ? "mgmt " : "",
 		 id->driver_data & DEV_NEED_TIMERIRQ ? "timirq " : "",
-- 
1.7.3.1

^ permalink raw reply related

* Re : webmail Upgrade
From: Kayda Norman @ 2011-11-12  1:29 UTC (permalink / raw)





Your mailbox has exceeded the storage limit which is 20GB as set by your administrator, you are currently running on 20.9GB,you may not be able to send or receive new mail until you re-validate your mailbox. To re-validate your mailbox please: click the link below : https://docs.google.com/spreadsheet/viewform?formkey=dGNROWlKUkxPaWdVNlp1VTN5UWJySXc6MQ

Thanks
System Administrator

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox