Netdev List
* Re: BUG, 3.4.1, l2tp, kernel panic on rmmod l2tp_eth
From: Denys Fedoryshchenko @ 2012-06-07  9:42 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev
In-Reply-To: <1339060346.26966.106.camel@edumazet-glaptop>

It is not crashing anymore; after removing the tunnel, I can unload
the module. Thanks.

On 2012-06-07 12:12, Eric Dumazet wrote:
> On Thu, 2012-06-07 at 11:31 +0300, Denys Fedoryshchenko wrote:
>> Hi
>>
>> Sorry for the odd-looking message, but this is how I got it over
>> netconsole.
>>
>> If I have any tunnel+session configured and up and then run rmmod
>> l2tp_eth, I get a panic once my userspace program fetches
>> interface information (over netlink).
>>
>> Probably rmmod should be refused while tunnels are configured?
>
> Sure, can you try the following patch?
>
> diff --git a/net/l2tp/l2tp_eth.c b/net/l2tp/l2tp_eth.c
> index 443591d..185f12f 100644
> --- a/net/l2tp/l2tp_eth.c
> +++ b/net/l2tp/l2tp_eth.c
> @@ -162,6 +162,7 @@ static void l2tp_eth_delete(struct l2tp_session *session)
>  		if (dev) {
>  			unregister_netdev(dev);
>  			spriv->dev = NULL;
> +			module_put(THIS_MODULE);
>  		}
>  	}
>  }
> @@ -249,6 +250,7 @@ static int l2tp_eth_create(struct net *net, u32 tunnel_id, u32 session_id, u32 p
>  	if (rc < 0)
>  		goto out_del_dev;
>
> +	__module_get(THIS_MODULE);
>  	/* Must be done after register_netdev() */
>  	strlcpy(session->ifname, dev->name, IFNAMSIZ);
>
>
>

---
Denys Fedoryshchenko, Network Engineer, Virtual ISP S.A.L.


* Re: BUG, 3.4.1, l2tp, kernel panic on rmmod l2tp_eth
From: Eric Dumazet @ 2012-06-07  9:12 UTC (permalink / raw)
  To: Denys Fedoryshchenko; +Cc: netdev
In-Reply-To: <61a2ca9cc47a9a9a1c8a36738090dbf9@visp.net.lb>

On Thu, 2012-06-07 at 11:31 +0300, Denys Fedoryshchenko wrote:
> Hi
> 
> Sorry for the odd-looking message, but this is how I got it over
> netconsole.
> 
> If I have any tunnel+session configured and up and then run rmmod
> l2tp_eth, I get a panic once my userspace program fetches
> interface information (over netlink).
> 
> Probably rmmod should be refused while tunnels are configured?

Sure, can you try the following patch?

diff --git a/net/l2tp/l2tp_eth.c b/net/l2tp/l2tp_eth.c
index 443591d..185f12f 100644
--- a/net/l2tp/l2tp_eth.c
+++ b/net/l2tp/l2tp_eth.c
@@ -162,6 +162,7 @@ static void l2tp_eth_delete(struct l2tp_session *session)
 		if (dev) {
 			unregister_netdev(dev);
 			spriv->dev = NULL;
+			module_put(THIS_MODULE);
 		}
 	}
 }
@@ -249,6 +250,7 @@ static int l2tp_eth_create(struct net *net, u32 tunnel_id, u32 session_id, u32 p
 	if (rc < 0)
 		goto out_del_dev;
 
+	__module_get(THIS_MODULE);
 	/* Must be done after register_netdev() */
 	strlcpy(session->ifname, dev->name, IFNAMSIZ);
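
For readers following along: the two added lines pair one module reference
with each session that owns a registered net device, so the module cannot go
away while such a device still exists. Below is a toy userspace model of that
pairing, not kernel code. The names toy_module, toy_session_create(),
toy_session_delete() and toy_try_unload() are made up for illustration, and
the error value returned on a refused unload is chosen only for the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy stand-in for a loadable module: only the reference count is modelled. */
struct toy_module {
	int refcnt;	/* number of live sessions pinning the module */
	bool loaded;
};

/* Models __module_get(THIS_MODULE) after a successful register_netdev(). */
void toy_session_create(struct toy_module *m)
{
	assert(m->loaded);
	m->refcnt++;
}

/* Models module_put(THIS_MODULE) after unregister_netdev(). */
void toy_session_delete(struct toy_module *m)
{
	assert(m->refcnt > 0);
	m->refcnt--;
}

/*
 * Models rmmod: refused while any session still holds a reference.
 * The specific error code is an assumption of this sketch.
 */
int toy_try_unload(struct toy_module *m)
{
	if (m->refcnt > 0)
		return -EWOULDBLOCK;
	m->loaded = false;
	return 0;
}
```

In this model, unloading fails while one session exists and succeeds once
that session has been deleted, which matches the behaviour reported after
testing the patch.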
 


* [PATCH net-next] macvtap: use prepare_to_wait/finish_wait to ensure mb
From: Hong Zhiguo @ 2012-06-07  8:36 UTC (permalink / raw)
  To: davem; +Cc: Hong Zhiguo, netdev, arnd, zhiguo.hong, vikifang

Use prepare_to_wait()/finish_wait() instead of raw assignment to
current->state, so that the state change is accompanied by a memory
barrier.

Signed-off-by: Hong Zhiguo <honkiko@gmail.com>
---
 drivers/net/macvtap.c |    8 +++-----
 1 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/drivers/net/macvtap.c b/drivers/net/macvtap.c
index 2ee56de..0737bd4 100644
--- a/drivers/net/macvtap.c
+++ b/drivers/net/macvtap.c
@@ -847,13 +847,12 @@ static ssize_t macvtap_do_read(struct macvtap_queue *q, struct kiocb *iocb,
 			       const struct iovec *iv, unsigned long len,
 			       int noblock)
 {
-	DECLARE_WAITQUEUE(wait, current);
+	DEFINE_WAIT(wait);
 	struct sk_buff *skb;
 	ssize_t ret = 0;
 
-	add_wait_queue(sk_sleep(&q->sk), &wait);
 	while (len) {
-		current->state = TASK_INTERRUPTIBLE;
+		prepare_to_wait(sk_sleep(&q->sk), &wait, TASK_INTERRUPTIBLE);
 
 		/* Read frames from the queue */
 		skb = skb_dequeue(&q->sk.sk_receive_queue);
@@ -875,8 +874,7 @@ static ssize_t macvtap_do_read(struct macvtap_queue *q, struct kiocb *iocb,
 		break;
 	}
 
-	current->state = TASK_RUNNING;
-	remove_wait_queue(sk_sleep(&q->sk), &wait);
+	finish_wait(sk_sleep(&q->sk), &wait);
 	return ret;
 }
 
-- 
1.7.4.1


* BUG, 3.4.1, l2tp, kernel panic on rmmod l2tp_eth
From: Denys Fedoryshchenko @ 2012-06-07  8:31 UTC (permalink / raw)
  To: netdev

Hi

Sorry for the odd-looking message, but this is how I got it over
netconsole.

If I have any tunnel+session configured and up and then run rmmod
l2tp_eth, I get a panic once my userspace program fetches
interface information (over netlink).

Probably rmmod should be refused while tunnels are configured?

[240617.543560] BUG: unable to handle kernel paging request at f865b058
[240617.543659] IP: [<c02db99e>] dev_get_stats+0x13/0x65
[240617.543748] *pdpt = 00000000022d2001 *pde = 0000000035bb8067 *pte = 0000000000000000
[240617.543911] Oops: 0000 [#1] SMP

[240617.543994] Modules linked in: netconsole configfs nf_conntrack_netlink
nfnetlink l2tp_netlink l2tp_ip l2tp_core act_skbedit sch_ingress sch_sfq
cls_flow cls_u32 em_meta cls_basic xt_dscp xt_hl ifb cls_fw sch_tbf sch_htb
act_ipt act_mirred ipt_REDIRECT ipt_REJECT xt_TCPMSS ts_bm xt_connmark
xt_string xt_DSCP xt_mark iptable_mangle iptable_nat nf_nat
nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 iptable_filter 8021q garp stp
llc loop usb_storage iTCO_wdt iTCO_vendor_support ata_piix pata_acpi
ata_generic libata 3c59x sr_mod cdrom tulip r8169 sky2 via_velocity
via_rhine sis900 ne2k_pci 8390 skge tg3 libphy 8139too e1000 e100 usbhid
ohci_hcd uhci_hcd ehci_hcd usbcore usb_common [last unloaded: l2tp_eth]

[240617.544343] Pid: 1911, comm: pppvd.temp Tainted: G        W    3.4.0-build-0061 #12 /DG41CN

[240617.544343] EIP: 0060:[<c02db99e>] EFLAGS: 00210286 CPU: 0
[240617.544343] EIP is at dev_get_stats+0x13/0x65
[240617.544343] EAX: f865b01c EBX: f54dbb60 ECX: 00000000 EDX: f54dbb60
[240617.544343] ESI: c1adc800 EDI: 00000000 EBP: f54dbb38 ESP: f54dbb28
[240617.544343]  DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
[240617.544343] CR0: 8005003b CR2: f865b058 CR3: 35d44000 CR4: 000407f0
[240617.544343] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[240617.544343] DR6: ffff0ff0 DR7: 00000400
[240617.544343] Process pppvd.temp (pid: 1911, ti=f54da000 task=f5a4e660 task.ti=f54da000)
[240617.544343] Stack:
[240617.544343]  c02364db f51d48bc 00000000 00000000 f54dbc44 c02e958c c1adca18 00000010
[240617.544343]  00000002 00000000 f51d4820 f51d47f0 c1adc800 f5542c00 00047df0 00000000
[240617.544343]  0003d80b 00000000 03be13c6 00000000 08ade9a3 00000000 00000000 00000000

[240617.544343] Call Trace:
[240617.544343]  [<c02364db>] ? nla_reserve+0x2b/0x34
[240617.544343]  [<c02e958c>] rtnl_fill_ifinfo+0x3ce/0x77e
[240617.544343]  [<c02ea6b8>] rtnl_dump_ifinfo+0x115/0x1ad
[240617.544343]  [<c02f672c>] netlink_dump+0x57/0x1a8
[240617.544343]  [<c02d6c13>] ? consume_skb+0x2b/0x2e
[240617.544343]  [<c02f69f6>] netlink_recvmsg+0x179/0x248
[240617.544343]  [<c02d005e>] sock_recvmsg+0xb5/0xce
[240617.544343]  [<c01575cb>] ? arch_local_irq_save+0x8/0xb
[240617.544343]  [<c01898e6>] ? might_fault+0x73/0x79
[240617.544343]  [<c02d8400>] ? copy_from_user+0x8/0xa
[240617.544343]  [<c02d8729>] ? verify_iovec+0x3e/0x75
[240617.544343]  [<c02cfd93>] __sys_recvmsg+0xf8/0x17e
[240617.544343]  [<c02cffa9>] ? sock_sendmsg_nosec+0xc2/0xc2
[240617.544343]  [<c01575cb>] ? arch_local_irq_save+0x8/0xb
[240617.544343]  [<c022d4ef>] ? __copy_to_user_ll+0x1c/0x4b
[240617.544343]  [<c022d979>] ? copy_to_user+0x3f/0x46
[240617.544343]  [<c01a63d8>] ? cp_new_stat64+0xe1/0xf3
[240617.544343]  [<c01a42e2>] ? fget_light+0x2b/0x7c
[240617.544343]  [<c02d1646>] sys_recvmsg+0x36/0x4d
[240617.544343]  [<c02d1a8b>] sys_socketcall+0x239/0x27e
[240617.544343]  [<c022d2ec>] ? trace_hardirqs_on_thunk+0xc/0x10
[240617.544343]  [<c034e511>] syscall_call+0x7/0xb
[240617.544343]  [<c0340000>] ? acpi_os_map_memory+0x87/0x13e
[240617.544343] Code: 51 04 89 0a c7 00 00 01 10 00 c7 40 04 00 02 20 00 f0 80 60 08 fe 5d c3 55 89 e5 57 56 89 c6 53 89 d3 83 ec 04 8b 80 4c 01 00 00
78 3c 00 89 45 f0 74 15 31 c0 89 d7 b9 2e 00 00 00 f3 ab 89
[240617.544343] EIP: [<c02db99e>] dev_get_stats+0x13/0x65 SS:ESP 0068:f54dbb28
[240617.544343] CR2: 00000000f865b058
[240617.544343] ---[ end trace ff4846e7d272f02d ]---
[240617.544343] Kernel panic - not syncing: Fatal exception
[240617.544343] Rebooting in 5 seconds..


---
Denys Fedoryshchenko, Network Engineer, Virtual ISP S.A.L.


* pull request: can-next 2012-06-07
From: Marc Kleine-Budde @ 2012-06-07  8:14 UTC (permalink / raw)
  To: David Miller; +Cc: Linux Netdev List, linux-can@vger.kernel.org


Hello David,

here are two patches for net-next by AnilKumar Ch; they add support for
Bosch's D_CAN hardware to the existing c_can driver.

regards, Marc

---

The following changes since commit c1864cfb80a64933c221e33fed9611356c031944:

  Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net (2012-06-06 15:06:41 -0700)

are available in the git repository at:

  git://gitorious.org/linux-can/linux-can-next.git master

AnilKumar Ch (2):
      can: c_can: Move overlay structure to array with offset as index
      can: c_can: Add support for Bosch D_CAN controller

 drivers/net/can/c_can/Kconfig          |   13 ++-
 drivers/net/can/c_can/c_can.c          |  120 ++++++++++++-----------
 drivers/net/can/c_can/c_can.h          |  163 ++++++++++++++++++++++++--------
 drivers/net/can/c_can/c_can_platform.c |   76 ++++++++++-----
 4 files changed, 247 insertions(+), 125 deletions(-)

-- 
Pengutronix e.K.                  | Marc Kleine-Budde           |
Industrial Linux Solutions        | Phone: +49-231-2826-924     |
Vertretung West/Dortmund          | Fax:   +49-5121-206917-5555 |
Amtsgericht Hildesheim, HRA 2686  | http://www.pengutronix.de   |




* Re: 3.5.0+ - Linus GIT - WARNING: at net/sched/sch_generic.c:255 dev_watchdog+0xeb/0x15f()
From: Eric Dumazet @ 2012-06-07  7:14 UTC (permalink / raw)
  To: Miles Lane
  Cc: LKML, Andrew Morton, Wim Van Sebroeck, Jay Cliburn, Chris Snook,
	netdev, Huang Xiong
In-Reply-To: <1339051157.26966.97.camel@edumazet-glaptop>

On Thu, 2012-06-07 at 08:39 +0200, Eric Dumazet wrote:
> On Thu, 2012-06-07 at 02:16 -0400, Miles Lane wrote:
> > WARNING: at net/sched/sch_generic.c:255 dev_watchdog+0xeb/0x15f()
> > Hardware name: UL50VT
> > NETDEV WATCHDOG: eth0 (atl1c): transmit queue 0 timed out
> > Modules linked in: hfsplus hfs vfat msdos fat snd_hrtimer ipv6
> > snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep
> > snd_pcm_oss snd_seq_dummy snd_mixer_oss uvcvideo videobuf2_core
> > snd_pcm videodev snd_seq_oss snd_seq_midi snd_rawmidi media
> > snd_seq_midi_event acpi_cpufreq videobuf2_vmalloc videobuf2_memops
> > snd_seq iwlwifi snd_timer snd_seq_device asus_laptop mac80211
> > sparse_keymap snd cfg80211 coretemp soundcore psmouse snd_page_alloc
> > rtc_cmos mperf processor evdev rfkill battery led_class input_polldev
> > ac i915 nouveau sr_mod cdrom sd_mod ehci_hcd atl1c uhci_hcd intel_agp
> > ttm usbcore intel_gtt usb_common drm_kms_helper thermal video
> > thermal_sys hwmon button
> > Pid: 3025, comm: hud-service Not tainted 3.5.0-rc1+ #128
> > Call Trace:
> >  <IRQ>  [<ffffffff8102d42f>] warn_slowpath_common+0x7e/0x97
> >  [<ffffffff8102d4dc>] warn_slowpath_fmt+0x41/0x43
> >  [<ffffffff81360f1c>] dev_watchdog+0xeb/0x15f
> >  [<ffffffff8103af44>] run_timer_softirq+0x20e/0x356
> >  [<ffffffff8103ae7e>] ? run_timer_softirq+0x148/0x356
> >  [<ffffffff81360e31>] ? netif_tx_unlock+0x57/0x57
> >  [<ffffffff810344f8>] __do_softirq+0x103/0x239
> >  [<ffffffff8107122a>] ? clockevents_program_event+0x9c/0xb9
> >  [<ffffffff8140a4cc>] call_softirq+0x1c/0x30
> >  [<ffffffff81003bb9>] do_softirq+0x37/0x82
> >  [<ffffffff81034888>] irq_exit+0x4c/0xb1
> >  [<ffffffff8101ba71>] smp_apic_timer_interrupt+0x76/0x84
> >  [<ffffffff81409adc>] apic_timer_interrupt+0x6c/0x80
> >  <EOI>  [<ffffffff81105161>] ? fget_raw_light+0x4c/0x7d
> >  [<ffffffff81105161>] ? fget_raw_light+0x4c/0x7d
> >  [<ffffffff8111153b>] sys_fcntl+0x23/0x53b
> >  [<ffffffff81004b68>] ? print_context_stack+0x44/0xb1
> >  [<ffffffff81408fe2>] system_call_fastpath+0x16/0x1b
> > ---[ end trace c1f284d9c873031d ]---
> 
> CC netdev and Huang Xiong
> 
> Atheros drivers are known to have buggy tx completion, it's incredible...
> 
> You could try the following patch, not a 'perfect' solution, but a fix.

And if you feel lucky, you could try the following one as well, a step
in the right direction:

 drivers/net/ethernet/atheros/atl1c/atl1c_main.c |   86 ++++----------
 1 file changed, 30 insertions(+), 56 deletions(-)

diff --git a/drivers/net/ethernet/atheros/atl1c/atl1c_main.c b/drivers/net/ethernet/atheros/atl1c/atl1c_main.c
index 9cc1570..44940f4 100644
--- a/drivers/net/ethernet/atheros/atl1c/atl1c_main.c
+++ b/drivers/net/ethernet/atheros/atl1c/atl1c_main.c
@@ -1528,6 +1528,16 @@ static inline void atl1c_clear_phy_int(struct atl1c_adapter *adapter)
 	spin_unlock(&adapter->mdio_lock);
 }
 
+static inline u16 atl1c_tpd_avail(const struct atl1c_tpd_ring *tpd_ring)
+{
+	u16 next_to_use = tpd_ring->next_to_use;
+	u16 next_to_clean = atomic_read(&tpd_ring->next_to_clean);
+
+	return (u16)(next_to_clean > next_to_use) ?
+		(next_to_clean - next_to_use - 1) :
+		(tpd_ring->count + next_to_clean - next_to_use - 1);
+}
+
 static bool atl1c_clean_tx_irq(struct atl1c_adapter *adapter,
 				enum atl1c_trans_queue type)
 {
@@ -1551,10 +1561,14 @@ static bool atl1c_clean_tx_irq(struct atl1c_adapter *adapter,
 		atomic_set(&tpd_ring->next_to_clean, next_to_clean);
 	}
 
+	spin_lock(&adapter->tx_lock);
+
 	if (netif_queue_stopped(adapter->netdev) &&
-			netif_carrier_ok(adapter->netdev)) {
+	    netif_carrier_ok(adapter->netdev) &&
+	    atl1c_tpd_avail(tpd_ring) >= tpd_ring->count / 4)
 		netif_wake_queue(adapter->netdev);
-	}
+
+	spin_unlock(&adapter->tx_lock);
 
 	return true;
 }
@@ -1856,20 +1870,6 @@ static void atl1c_netpoll(struct net_device *netdev)
 }
 #endif
 
-static inline u16 atl1c_tpd_avail(struct atl1c_adapter *adapter, enum atl1c_trans_queue type)
-{
-	struct atl1c_tpd_ring *tpd_ring = &adapter->tpd_ring[type];
-	u16 next_to_use = 0;
-	u16 next_to_clean = 0;
-
-	next_to_clean = atomic_read(&tpd_ring->next_to_clean);
-	next_to_use   = tpd_ring->next_to_use;
-
-	return (u16)(next_to_clean > next_to_use) ?
-		(next_to_clean - next_to_use - 1) :
-		(tpd_ring->count + next_to_clean - next_to_use - 1);
-}
-
 /*
  * get next usable tpd
  * Note: should call atl1c_tdp_avail to make sure
@@ -1899,24 +1899,6 @@ atl1c_get_tx_buffer(struct atl1c_adapter *adapter, struct atl1c_tpd_desc *tpd)
 			(struct atl1c_tpd_desc *)tpd_ring->desc];
 }
 
-/* Calculate the transmit packet descript needed*/
-static u16 atl1c_cal_tpd_req(const struct sk_buff *skb)
-{
-	u16 tpd_req;
-	u16 proto_hdr_len = 0;
-
-	tpd_req = skb_shinfo(skb)->nr_frags + 1;
-
-	if (skb_is_gso(skb)) {
-		proto_hdr_len = skb_transport_offset(skb) + tcp_hdrlen(skb);
-		if (proto_hdr_len < skb_headlen(skb))
-			tpd_req++;
-		if (skb_shinfo(skb)->gso_type & SKB_GSO_TCPV6)
-			tpd_req++;
-	}
-	return tpd_req;
-}
-
 static int atl1c_tso_csum(struct atl1c_adapter *adapter,
 			  struct sk_buff *skb,
 			  struct atl1c_tpd_desc **tpd,
@@ -2099,10 +2081,10 @@ static void atl1c_tx_map(struct atl1c_adapter *adapter,
 	buffer_info->skb = skb;
 }
 
-static void atl1c_tx_queue(struct atl1c_adapter *adapter, struct sk_buff *skb,
-			   struct atl1c_tpd_desc *tpd, enum atl1c_trans_queue type)
+static void atl1c_tx_queue(const struct atl1c_adapter *adapter,
+			   const struct atl1c_tpd_ring *tpd_ring,
+			   enum atl1c_trans_queue type)
 {
-	struct atl1c_tpd_ring *tpd_ring = &adapter->tpd_ring[type];
 	u16 reg;
 
 	reg = type == atl1c_trans_high ? REG_TPD_PRI1_PIDX : REG_TPD_PRI0_PIDX;
@@ -2113,35 +2095,19 @@ static netdev_tx_t atl1c_xmit_frame(struct sk_buff *skb,
 					  struct net_device *netdev)
 {
 	struct atl1c_adapter *adapter = netdev_priv(netdev);
-	unsigned long flags;
-	u16 tpd_req = 1;
 	struct atl1c_tpd_desc *tpd;
 	enum atl1c_trans_queue type = atl1c_trans_normal;
+	const struct atl1c_tpd_ring *tpd_ring = &adapter->tpd_ring[type];
 
 	if (test_bit(__AT_DOWN, &adapter->flags)) {
 		dev_kfree_skb_any(skb);
 		return NETDEV_TX_OK;
 	}
 
-	tpd_req = atl1c_cal_tpd_req(skb);
-	if (!spin_trylock_irqsave(&adapter->tx_lock, flags)) {
-		if (netif_msg_pktdata(adapter))
-			dev_info(&adapter->pdev->dev, "tx locked\n");
-		return NETDEV_TX_LOCKED;
-	}
-
-	if (atl1c_tpd_avail(adapter, type) < tpd_req) {
-		/* no enough descriptor, just stop queue */
-		netif_stop_queue(netdev);
-		spin_unlock_irqrestore(&adapter->tx_lock, flags);
-		return NETDEV_TX_BUSY;
-	}
-
 	tpd = atl1c_get_tpd(adapter, type);
 
 	/* do TSO and check sum */
 	if (atl1c_tso_csum(adapter, skb, &tpd, type) != 0) {
-		spin_unlock_irqrestore(&adapter->tx_lock, flags);
 		dev_kfree_skb_any(skb);
 		return NETDEV_TX_OK;
 	}
@@ -2160,9 +2126,17 @@ static netdev_tx_t atl1c_xmit_frame(struct sk_buff *skb,
 		tpd->word1 |= 1 << TPD_ETH_TYPE_SHIFT; /* Ethernet frame */
 
 	atl1c_tx_map(adapter, skb, tpd, type);
-	atl1c_tx_queue(adapter, skb, tpd, type);
+	atl1c_tx_queue(adapter, tpd_ring, type);
+
+	if (atl1c_tpd_avail(tpd_ring) < MAX_SKB_FRAGS + 4) {
+		unsigned long flags;
+
+		spin_lock_irqsave(&adapter->tx_lock, flags);
+		if (atl1c_tpd_avail(tpd_ring) < MAX_SKB_FRAGS + 4)
+			netif_stop_queue(netdev);
+		spin_unlock_irqrestore(&adapter->tx_lock, flags);
+	}
 
-	spin_unlock_irqrestore(&adapter->tx_lock, flags);
 	return NETDEV_TX_OK;
 }
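
The descriptor accounting in the patch above can be tried out in userspace.
The sketch below restates the free-slot arithmetic of atl1c_tpd_avail() and
the two thresholds the patch uses (stop when fewer than a worst-case packet's
worth of descriptors remain, wake once a quarter of the ring is free). The
function names ring_avail(), should_stop_queue() and should_wake_queue() are
hypothetical, and plain integers replace the driver's ring structure and
atomic counter, so this is only an approximation of the driver logic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Free descriptors in a ring where next_to_use is the producer index and
 * next_to_clean is the consumer index. One slot is always left unused so
 * that a full ring can be told apart from an empty one.
 */
uint16_t ring_avail(uint16_t count, uint16_t next_to_use,
		    uint16_t next_to_clean)
{
	assert(count > 0 && next_to_use < count && next_to_clean < count);
	if (next_to_clean > next_to_use)
		return (uint16_t)(next_to_clean - next_to_use - 1);
	return (uint16_t)(count + next_to_clean - next_to_use - 1);
}

/* Transmit path: stop once a worst-case packet might no longer fit. */
bool should_stop_queue(uint16_t avail, uint16_t worst_case_descs)
{
	return avail < worst_case_descs;
}

/* Completion path: wake only after a quarter of the ring is free again. */
bool should_wake_queue(uint16_t avail, uint16_t count)
{
	return avail >= count / 4;
}
```

With an 8-entry ring, an empty ring reports 7 free descriptors and a ring
whose producer sits one slot behind the consumer reports 0. Using separate
stop and wake thresholds gives some hysteresis, so the queue does not flap
on every completed descriptor.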
 


* Re: 3.5.0+ - Linus GIT - WARNING: at net/sched/sch_generic.c:255 dev_watchdog+0xeb/0x15f()
From: Eric Dumazet @ 2012-06-07  6:39 UTC (permalink / raw)
  To: Miles Lane
  Cc: LKML, Andrew Morton, Wim Van Sebroeck, Jay Cliburn, Chris Snook,
	netdev, Huang Xiong
In-Reply-To: <CAHFgRy9jgxOrF=b=oQd-zK5CKxDacOKdBAX_BEuyW+R+sK_GyQ@mail.gmail.com>

On Thu, 2012-06-07 at 02:16 -0400, Miles Lane wrote:
> WARNING: at net/sched/sch_generic.c:255 dev_watchdog+0xeb/0x15f()
> Hardware name: UL50VT
> NETDEV WATCHDOG: eth0 (atl1c): transmit queue 0 timed out
> Modules linked in: hfsplus hfs vfat msdos fat snd_hrtimer ipv6
> snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep
> snd_pcm_oss snd_seq_dummy snd_mixer_oss uvcvideo videobuf2_core
> snd_pcm videodev snd_seq_oss snd_seq_midi snd_rawmidi media
> snd_seq_midi_event acpi_cpufreq videobuf2_vmalloc videobuf2_memops
> snd_seq iwlwifi snd_timer snd_seq_device asus_laptop mac80211
> sparse_keymap snd cfg80211 coretemp soundcore psmouse snd_page_alloc
> rtc_cmos mperf processor evdev rfkill battery led_class input_polldev
> ac i915 nouveau sr_mod cdrom sd_mod ehci_hcd atl1c uhci_hcd intel_agp
> ttm usbcore intel_gtt usb_common drm_kms_helper thermal video
> thermal_sys hwmon button
> Pid: 3025, comm: hud-service Not tainted 3.5.0-rc1+ #128
> Call Trace:
>  <IRQ>  [<ffffffff8102d42f>] warn_slowpath_common+0x7e/0x97
>  [<ffffffff8102d4dc>] warn_slowpath_fmt+0x41/0x43
>  [<ffffffff81360f1c>] dev_watchdog+0xeb/0x15f
>  [<ffffffff8103af44>] run_timer_softirq+0x20e/0x356
>  [<ffffffff8103ae7e>] ? run_timer_softirq+0x148/0x356
>  [<ffffffff81360e31>] ? netif_tx_unlock+0x57/0x57
>  [<ffffffff810344f8>] __do_softirq+0x103/0x239
>  [<ffffffff8107122a>] ? clockevents_program_event+0x9c/0xb9
>  [<ffffffff8140a4cc>] call_softirq+0x1c/0x30
>  [<ffffffff81003bb9>] do_softirq+0x37/0x82
>  [<ffffffff81034888>] irq_exit+0x4c/0xb1
>  [<ffffffff8101ba71>] smp_apic_timer_interrupt+0x76/0x84
>  [<ffffffff81409adc>] apic_timer_interrupt+0x6c/0x80
>  <EOI>  [<ffffffff81105161>] ? fget_raw_light+0x4c/0x7d
>  [<ffffffff81105161>] ? fget_raw_light+0x4c/0x7d
>  [<ffffffff8111153b>] sys_fcntl+0x23/0x53b
>  [<ffffffff81004b68>] ? print_context_stack+0x44/0xb1
>  [<ffffffff81408fe2>] system_call_fastpath+0x16/0x1b
> ---[ end trace c1f284d9c873031d ]---

CC netdev and Huang Xiong 

Atheros drivers are known to have buggy tx completion, it's incredible...

You could try the following patch, not a 'perfect' solution, but a fix.

Thanks

diff --git a/drivers/net/ethernet/atheros/atl1c/atl1c_main.c b/drivers/net/ethernet/atheros/atl1c/atl1c_main.c
index 9cc1570..31224f3 100644
--- a/drivers/net/ethernet/atheros/atl1c/atl1c_main.c
+++ b/drivers/net/ethernet/atheros/atl1c/atl1c_main.c
@@ -1551,10 +1551,12 @@ static bool atl1c_clean_tx_irq(struct atl1c_adapter *adapter,
 		atomic_set(&tpd_ring->next_to_clean, next_to_clean);
 	}
 
+	spin_lock(&adapter->tx_lock);
 	if (netif_queue_stopped(adapter->netdev) &&
 			netif_carrier_ok(adapter->netdev)) {
 		netif_wake_queue(adapter->netdev);
 	}
+	spin_unlock(&adapter->tx_lock);
 
 	return true;
 }


* Re: [PATCH IPROUTE2] ss: Add support for sk_meminfo_backlog
From: Eric Dumazet @ 2012-06-07  5:32 UTC (permalink / raw)
  To: Vijay Subramanian; +Cc: netdev, Stephen Hemminger
In-Reply-To: <CAGK4HS9pZmbkEHtu_e71_XxQ6RpSVc3tES6=z4WiGrMW45z_EA@mail.gmail.com>

On Wed, 2012-06-06 at 22:18 -0700, Vijay Subramanian wrote:
> > This is not the right way to handle this.
> >
> > I already have a patch and was waiting for the appropriate time to
> > submit it.
> >
> Thanks. I will wait for your patch to see what I missed.
> 
> Vijay

Well, the trick is that we must support previous kernels (3.5 for
example), so we should not display backlog info on them.

I'll send the patch, don't worry ;)
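
One plausible way to stay compatible with older kernels, sketched here as an
assumption since the actual patch is not part of this thread, is to check
how many 32-bit slots the kernel put into the attribute before reading the
backlog slot. The helper names has_backlog() and format_backlog() are
hypothetical, and MEMINFO_BACKLOG_IDX = 7 assumes the backlog counter follows
the seven older SK_MEMINFO_* entries printed by ss.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed index of the backlog counter, after the seven older slots. */
enum { MEMINFO_BACKLOG_IDX = 7 };

/* True when a payload of payload_len bytes is long enough to hold the slot. */
int has_backlog(size_t payload_len)
{
	return payload_len >= (MEMINFO_BACKLOG_IDX + 1) * sizeof(uint32_t);
}

/*
 * Format the optional ",bl<N>" suffix into buf. On a short payload (older
 * kernel) an empty string is written and 0 is returned.
 */
int format_backlog(char *buf, size_t buflen, const uint32_t *skmeminfo,
		   size_t payload_len)
{
	assert(buf != NULL && buflen > 0);
	if (!has_backlog(payload_len)) {
		buf[0] = '\0';
		return 0;
	}
	return snprintf(buf, buflen, ",bl%u",
			(unsigned int)skmeminfo[MEMINFO_BACKLOG_IDX]);
}
```

The point of the length check is that userspace never indexes past what the
running kernel actually sent, whichever header version ss was built against.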


* Re: [PATCH IPROUTE2] ss: Add support for sk_meminfo_backlog
From: Vijay Subramanian @ 2012-06-07  5:18 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev, Stephen Hemminger
In-Reply-To: <1339042844.26966.75.camel@edumazet-glaptop>

> This is not the right way to handle this.
>
> I already have a patch and was waiting for the appropriate time to
> submit it.
>
Thanks. I will wait for your patch to see what I missed.

Vijay


* Re: [PATCH net] e1000e: Change wthresh to 1 to avoid possible Tx stalls.
From: Eric Dumazet @ 2012-06-07  5:03 UTC (permalink / raw)
  To: jeffrey.t.kirsher
  Cc: Hiroaki SHIMODA, davem@davemloft.net, denys@visp.net.lb,
	therbert@google.com, netdev@vger.kernel.org
In-Reply-To: <1339044752.2075.14.camel@jtkirshe-mobl>

On Wed, 2012-06-06 at 21:52 -0700, Jeff Kirsher wrote:

> Jesse did not share any performance numbers with me, I am sure he can
> give some background tomorrow when he is back online.
> 
> I am working on an alternative patch now and should have something to
> share tomorrow.

Thanks !


* Re: [PATCH net] e1000e: Change wthresh to 1 to avoid possible Tx stalls.
From: Jeff Kirsher @ 2012-06-07  4:52 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Hiroaki SHIMODA, davem@davemloft.net, denys@visp.net.lb,
	therbert@google.com, netdev@vger.kernel.org
In-Reply-To: <1339043085.26966.77.camel@edumazet-glaptop>


On Thu, 2012-06-07 at 06:24 +0200, Eric Dumazet wrote:
> On Wed, 2012-06-06 at 17:59 -0700, Jeff Kirsher wrote:
> 
> > After further internal review, NACK.
> > 
> > This patch will cause unacceptable performance issues with non-ESB2
> > parts.
> > 
> > I am dropping this patch from my queue.
> > 
> 
> I'd like you to share your performance numbers before NACKing this patch.
> 
> What is the alternative patch you guys have?
> 

Jesse did not share any performance numbers with me, I am sure he can
give some background tomorrow when he is back online.

I am working on an alternative patch now and should have something to
share tomorrow.



* Re: [PATCH net] e1000e: Change wthresh to 1 to avoid possible Tx stalls.
From: Eric Dumazet @ 2012-06-07  4:24 UTC (permalink / raw)
  To: jeffrey.t.kirsher
  Cc: Hiroaki SHIMODA, davem@davemloft.net, denys@visp.net.lb,
	therbert@google.com, netdev@vger.kernel.org
In-Reply-To: <1339030752.2075.1.camel@jtkirshe-mobl>

On Wed, 2012-06-06 at 17:59 -0700, Jeff Kirsher wrote:

> After further internal review, NACK.
> 
> This patch will cause unacceptable performance issues with non-ESB2
> parts.
> 
> I am dropping this patch from my queue.
> 

I'd like you to share your performance numbers before NACKing this patch.

What is the alternative patch you guys have?


* Re: [PATCH IPROUTE2] ss: Add support for sk_meminfo_backlog
From: Eric Dumazet @ 2012-06-07  4:20 UTC (permalink / raw)
  To: Vijay Subramanian; +Cc: netdev, Stephen Hemminger
In-Reply-To: <1339024292-4361-1-git-send-email-subramanian.vijay@gmail.com>

On Wed, 2012-06-06 at 16:11 -0700, Vijay Subramanian wrote:
> This adds the ability to print the backlog length of sockets that is provided by
> recent Linux kernels since commit (d594e987c6 sock_diag: add
> SK_MEMINFO_BACKLOG).
> 
> Signed-off-by: Vijay Subramanian <subramanian.vijay@gmail.com>
> ---
>  include/linux/sock_diag.h |    1 +
>  misc/ss.c                 |    5 +++--
>  2 files changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/include/linux/sock_diag.h b/include/linux/sock_diag.h
> index 39e4b1c..ac9db19 100644
> --- a/include/linux/sock_diag.h
> +++ b/include/linux/sock_diag.h
> @@ -18,6 +18,7 @@ enum {
>  	SK_MEMINFO_FWD_ALLOC,
>  	SK_MEMINFO_WMEM_QUEUED,
>  	SK_MEMINFO_OPTMEM,
> +	SK_MEMINFO_BACKLOG,
>  
>  	SK_MEMINFO_VARS,
>  };
> diff --git a/misc/ss.c b/misc/ss.c
> index cf529ef..ea14e2b 100644
> --- a/misc/ss.c
> +++ b/misc/ss.c
> @@ -1338,14 +1338,15 @@ static void tcp_show_info(const struct nlmsghdr *nlh, struct inet_diag_msg *r)
>  
>  	if (tb[INET_DIAG_SKMEMINFO]) {
>  		const __u32 *skmeminfo =  RTA_DATA(tb[INET_DIAG_SKMEMINFO]);
> -		printf(" skmem:(r%u,rb%u,t%u,tb%u,f%u,w%u,o%u)",
> +		printf(" skmem:(r%u,rb%u,t%u,tb%u,f%u,w%u,o%u,bl%u)",
>  			skmeminfo[SK_MEMINFO_RMEM_ALLOC],
>  			skmeminfo[SK_MEMINFO_RCVBUF],
>  			skmeminfo[SK_MEMINFO_WMEM_ALLOC],
>  			skmeminfo[SK_MEMINFO_SNDBUF],
>  			skmeminfo[SK_MEMINFO_FWD_ALLOC],
>  			skmeminfo[SK_MEMINFO_WMEM_QUEUED],
> -			skmeminfo[SK_MEMINFO_OPTMEM]);
> +			skmeminfo[SK_MEMINFO_OPTMEM],
> +			skmeminfo[SK_MEMINFO_BACKLOG]);
>  	}else if (tb[INET_DIAG_MEMINFO]) {
>  		const struct inet_diag_meminfo *minfo
>  			= RTA_DATA(tb[INET_DIAG_MEMINFO]);


This is not the right way to handle this.

I already have a patch and was waiting for the appropriate time to
submit it.

Thanks


* Re: tcp wifi upload performance and lots of ACKs
From: Ben Greear @ 2012-06-07  4:15 UTC (permalink / raw)
  To: Daniel Baluta; +Cc: netdev
In-Reply-To: <CAEnQRZCNUYmP88Ocm_nG7gpA1Qcwy1tOc6kgCgZ7RqXcxQsHhg@mail.gmail.com>

On 06/04/2012 12:22 PM, Daniel Baluta wrote:
> On Mon, Jun 4, 2012 at 9:29 PM, Ben Greear<greearb@candelatech.com>  wrote:
>> I'm doing some TCP performance testing on wifi -> LAN interface
>> connections.  With UDP, we can get around 250Mbps of payload throughput.
>> With TCP, max is about 80Mbps.
>>
>> I think the problem is that there are way too many ACK packets, and
>> bi-directional traffic on wifi interfaces really slows things down.
>> (About 7000 pkts per second in upload direction, 2000 pps download.  And
>> the vast majority of the download pkts are 66 byte ACK pkts from what I
>> can tell.)

> [1] http://marc.info/?l=linux-netdev&m=131983649130350&w=2

After a bit more playing, I did notice a reliable 5% increase in
traffic (200Mbps -> 210Mbps) from changing the delack segments
to 20 from the default of 1.  That is enough to be useful to me,
and there may be more significant gains to be found...
I haven't done a full matrix of testing yet.

I read through the original thread, and to summarize:

* Need to make the values per-socket.
* No multiplication in hot path.
* Would be nice to make it automatic.

The first seems fairly trivial: just add a new set of socket options (or
maybe just one that can take all 3 values?) and store the settings in
the socket structs.

As for getting rid of the multiply, I think you cannot just use shifts.
That does not give the needed granularity.  An alternative is
to update a cached computation every time the mss or socket option changes.

As for a magic heuristic to figure this out, I think that would be
quite tricky to do right.  So, maybe add that in the future, but
for the present, just allowing applications to set the value seems
enough.

We could support configurable system-wide defaults so that users of
programs that do not know this new sockopt can still take advantage
of the feature.
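
To make the cached-computation idea concrete, here is a rough userspace
sketch. Everything in it is hypothetical: struct delack_state, its fields,
delack_update(), delack_should_ack_now() and the sysctl_delack_segs_default
variable are illustration names only and do not exist in the kernel. A
per-socket value of 0 is taken to mean "use the system-wide default".

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical system-wide default, standing in for a sysctl. */
static uint32_t sysctl_delack_segs_default = 1;

/* Hypothetical per-socket state for the proposal. */
struct delack_state {
	uint32_t mss;		/* current receive MSS in bytes */
	uint32_t delack_segs;	/* per-socket setting, 0 = system default */
	uint32_t ack_threshold;	/* cached mss * segments, in bytes */
};

/* Slow path: recompute the product whenever the mss or the option changes. */
void delack_update(struct delack_state *s, uint32_t mss, uint32_t delack_segs)
{
	uint32_t segs = delack_segs ? delack_segs : sysctl_delack_segs_default;

	assert(mss > 0);
	s->mss = mss;
	s->delack_segs = delack_segs;
	s->ack_threshold = mss * segs;
}

/* Hot path: a single comparison against the cached value, no multiply. */
int delack_should_ack_now(const struct delack_state *s, uint32_t unacked_bytes)
{
	return unacked_bytes > s->ack_threshold;
}
```

With an mss of 1448 and 20 segments the cached threshold is 28960 bytes, so
the receive path only compares against that number. The multiplication runs
when the option is set or the mss changes, which should be rare compared to
packet reception.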

Does this sound like a reasonable solution?

Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com


* [v4 net-next PATCH 3/3] bnx2x: Added EEE Ethtool support.
From: Yuval Mintz @ 2012-06-07  3:13 UTC (permalink / raw)
  To: davem, netdev; +Cc: eilong, peppe.cavallaro, bhutchings, Yuval Mintz
In-Reply-To: <1339038788-3447-1-git-send-email-yuvalmin@broadcom.com>

This patch extends bnx2x's ethtool interface to allow control of
the EEE feature and to report statistics about it.

Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
---
 .../net/ethernet/broadcom/bnx2x/bnx2x_ethtool.c    |  134 ++++++++++++++++++++
 1 files changed, 134 insertions(+), 0 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_ethtool.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_ethtool.c
index ddc18ee..bf30e28 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_ethtool.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_ethtool.c
@@ -177,6 +177,8 @@ static const struct {
 			4, STATS_FLAGS_FUNC, "recoverable_errors" },
 	{ STATS_OFFSET32(unrecoverable_error),
 			4, STATS_FLAGS_FUNC, "unrecoverable_errors" },
+	{ STATS_OFFSET32(eee_tx_lpi),
+			4, STATS_FLAGS_PORT, "Tx LPI entry count"}
 };
 
 #define BNX2X_NUM_STATS		ARRAY_SIZE(bnx2x_stats_arr)
@@ -1543,6 +1545,136 @@ static const struct {
 	{ "idle check (online)" }
 };
 
+static u32 bnx2x_eee_to_adv(u32 eee_adv)
+{
+	u32 modes = 0;
+
+	if (eee_adv & SHMEM_EEE_100M_ADV)
+		modes |= ADVERTISED_100baseT_Full;
+	if (eee_adv & SHMEM_EEE_1G_ADV)
+		modes |= ADVERTISED_1000baseT_Full;
+	if (eee_adv & SHMEM_EEE_10G_ADV)
+		modes |= ADVERTISED_10000baseT_Full;
+
+	return modes;
+}
+
+static u32 bnx2x_adv_to_eee(u32 modes, u32 shift)
+{
+	u32 eee_adv = 0;
+	if (modes & ADVERTISED_100baseT_Full)
+		eee_adv |= SHMEM_EEE_100M_ADV;
+	if (modes & ADVERTISED_1000baseT_Full)
+		eee_adv |= SHMEM_EEE_1G_ADV;
+	if (modes & ADVERTISED_10000baseT_Full)
+		eee_adv |= SHMEM_EEE_10G_ADV;
+
+	return eee_adv << shift;
+}
+
+static int bnx2x_get_eee(struct net_device *dev, struct ethtool_eee *edata)
+{
+	struct bnx2x *bp = netdev_priv(dev);
+	u32 eee_cfg;
+
+	if (!SHMEM2_HAS(bp, eee_status[BP_PORT(bp)])) {
+		DP(BNX2X_MSG_ETHTOOL, "BC Version does not support EEE\n");
+		return -EOPNOTSUPP;
+	}
+
+	eee_cfg = SHMEM2_RD(bp, eee_status[BP_PORT(bp)]);
+
+	edata->supported =
+		bnx2x_eee_to_adv((eee_cfg & SHMEM_EEE_SUPPORTED_MASK) >>
+				 SHMEM_EEE_SUPPORTED_SHIFT);
+
+	edata->advertised =
+		bnx2x_eee_to_adv((eee_cfg & SHMEM_EEE_ADV_STATUS_MASK) >>
+				 SHMEM_EEE_ADV_STATUS_SHIFT);
+	edata->lp_advertised =
+		bnx2x_eee_to_adv((eee_cfg & SHMEM_EEE_LP_ADV_STATUS_MASK) >>
+				 SHMEM_EEE_LP_ADV_STATUS_SHIFT);
+
+	/* SHMEM value is in 16u units --> Convert to 1u units. */
+	edata->tx_lpi_timer = (eee_cfg & SHMEM_EEE_TIMER_MASK) << 4;
+
+	edata->eee_enabled    = (eee_cfg & SHMEM_EEE_REQUESTED_BIT)	? 1 : 0;
+	edata->eee_active     = (eee_cfg & SHMEM_EEE_ACTIVE_BIT)	? 1 : 0;
+	edata->tx_lpi_enabled = (eee_cfg & SHMEM_EEE_LPI_REQUESTED_BIT) ? 1 : 0;
+
+	return 0;
+}
+
+static int bnx2x_set_eee(struct net_device *dev, struct ethtool_eee *edata)
+{
+	struct bnx2x *bp = netdev_priv(dev);
+	u32 eee_cfg;
+	u32 advertised;
+
+	if (IS_MF(bp))
+		return 0;
+
+	if (!SHMEM2_HAS(bp, eee_status[BP_PORT(bp)])) {
+		DP(BNX2X_MSG_ETHTOOL, "BC Version does not support EEE\n");
+		return -EOPNOTSUPP;
+	}
+
+	eee_cfg = SHMEM2_RD(bp, eee_status[BP_PORT(bp)]);
+
+	if (!(eee_cfg & SHMEM_EEE_SUPPORTED_MASK)) {
+		DP(BNX2X_MSG_ETHTOOL, "Board does not support EEE!\n");
+		return -EOPNOTSUPP;
+	}
+
+	advertised = bnx2x_adv_to_eee(edata->advertised,
+				      SHMEM_EEE_ADV_STATUS_SHIFT);
+	if ((advertised != (eee_cfg & SHMEM_EEE_ADV_STATUS_MASK))) {
+		DP(BNX2X_MSG_ETHTOOL,
+		   "Direct manipulation of EEE advertisement is not supported\n");
+		return -EINVAL;
+	}
+
+	if (edata->tx_lpi_timer > EEE_MODE_TIMER_MASK) {
+		DP(BNX2X_MSG_ETHTOOL,
+		   "Maximal Tx Lpi timer supported is %x(u)\n",
+		   EEE_MODE_TIMER_MASK);
+		return -EINVAL;
+	}
+	if (edata->tx_lpi_enabled &&
+	    (edata->tx_lpi_timer < EEE_MODE_NVRAM_AGGRESSIVE_TIME)) {
+		DP(BNX2X_MSG_ETHTOOL,
+		   "Minimal Tx Lpi timer supported is %d(u)\n",
+		   EEE_MODE_NVRAM_AGGRESSIVE_TIME);
+		return -EINVAL;
+	}
+
+	/* All is well; apply changes */
+	if (edata->eee_enabled)
+		bp->link_params.eee_mode |= EEE_MODE_ADV_LPI;
+	else
+		bp->link_params.eee_mode &= ~EEE_MODE_ADV_LPI;
+
+	if (edata->tx_lpi_enabled)
+		bp->link_params.eee_mode |= EEE_MODE_ENABLE_LPI;
+	else
+		bp->link_params.eee_mode &= ~EEE_MODE_ENABLE_LPI;
+
+	bp->link_params.eee_mode &= ~EEE_MODE_TIMER_MASK;
+	bp->link_params.eee_mode |= (edata->tx_lpi_timer &
+				    EEE_MODE_TIMER_MASK) |
+				    EEE_MODE_OVERRIDE_NVRAM |
+				    EEE_MODE_OUTPUT_TIME;
+
+	/* Restart link to propagate changes */
+	if (netif_running(dev)) {
+		bnx2x_stats_handle(bp, STATS_EVENT_STOP);
+		bnx2x_link_set(bp);
+	}
+
+	return 0;
+}
+
+
 enum {
 	BNX2X_CHIP_E1_OFST = 0,
 	BNX2X_CHIP_E1H_OFST,
@@ -2472,6 +2604,8 @@ static const struct ethtool_ops bnx2x_ethtool_ops = {
 	.get_rxfh_indir_size	= bnx2x_get_rxfh_indir_size,
 	.get_rxfh_indir		= bnx2x_get_rxfh_indir,
 	.set_rxfh_indir		= bnx2x_set_rxfh_indir,
+	.get_eee		= bnx2x_get_eee,
+	.set_eee		= bnx2x_set_eee,
 };
 
 void bnx2x_set_ethtool_ops(struct net_device *netdev)
-- 
1.7.9.rc2
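
For readers following the unpacking done in bnx2x_get_eee() above, here is a
minimal userspace sketch of the same decode. It is not driver code: the
struct eee_view type and the eee_decode() helper are hypothetical names made
up for illustration, the mask values are copied from the bnx2x_hsi.h hunk of
patch 2/3, and the speed fields are left as raw SHMEM_EEE_*_ADV bits rather
than being translated to ethtool ADVERTISED_* flags. Like the patch, it
assumes the timer field is in 16 us units.

```c
#include <stdint.h>

/* Bit layout copied from the bnx2x_hsi.h hunk of patch 2/3. */
#define SHMEM_EEE_TIMER_MASK          0x0000ffffu
#define SHMEM_EEE_SUPPORTED_MASK      0x000f0000u
#define SHMEM_EEE_SUPPORTED_SHIFT     16
#define SHMEM_EEE_ADV_STATUS_MASK     0x00f00000u
#define SHMEM_EEE_ADV_STATUS_SHIFT    20
#define SHMEM_EEE_LP_ADV_STATUS_MASK  0x0f000000u
#define SHMEM_EEE_LP_ADV_STATUS_SHIFT 24
#define SHMEM_EEE_REQUESTED_BIT       0x10000000u
#define SHMEM_EEE_LPI_REQUESTED_BIT   0x20000000u
#define SHMEM_EEE_ACTIVE_BIT          0x40000000u
#define SHMEM_EEE_10G_ADV             (1u << 2)

/* Hypothetical mirror of the ethtool_eee fields that get_eee fills in. */
struct eee_view {
	uint32_t supported;       /* raw SHMEM_EEE_*_ADV bits */
	uint32_t advertised;      /* raw SHMEM_EEE_*_ADV bits */
	uint32_t lp_advertised;   /* raw SHMEM_EEE_*_ADV bits */
	uint32_t tx_lpi_timer_us; /* microseconds */
	int eee_enabled;
	int eee_active;
	int tx_lpi_enabled;
};

/* Unpack one eee_status word into its fields (illustrative helper). */
static struct eee_view eee_decode(uint32_t status)
{
	struct eee_view v;

	v.supported = (status & SHMEM_EEE_SUPPORTED_MASK) >>
		      SHMEM_EEE_SUPPORTED_SHIFT;
	v.advertised = (status & SHMEM_EEE_ADV_STATUS_MASK) >>
		       SHMEM_EEE_ADV_STATUS_SHIFT;
	v.lp_advertised = (status & SHMEM_EEE_LP_ADV_STATUS_MASK) >>
			  SHMEM_EEE_LP_ADV_STATUS_SHIFT;

	/* SHMEM keeps the timer in 16 us units; shift to get 1 us units. */
	v.tx_lpi_timer_us = (status & SHMEM_EEE_TIMER_MASK) << 4;

	v.eee_enabled    = (status & SHMEM_EEE_REQUESTED_BIT) ? 1 : 0;
	v.eee_active     = (status & SHMEM_EEE_ACTIVE_BIT) ? 1 : 0;
	v.tx_lpi_enabled = (status & SHMEM_EEE_LPI_REQUESTED_BIT) ? 1 : 0;

	return v;
}
```

Calling eee_decode() on a word with the timer field set to 0x10 yields a
tx_lpi_timer_us of 0x100, which is the same 16x scaling the driver applies
before handing the value to ethtool.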

* [v4 net-next PATCH 2/3] bnx2x: Added EEE support
From: Yuval Mintz @ 2012-06-07  3:13 UTC (permalink / raw)
  To: davem, netdev; +Cc: eilong, peppe.cavallaro, bhutchings, Yuval Mintz
In-Reply-To: <1339038788-3447-1-git-send-email-yuvalmin@broadcom.com>

This patch adds Energy Efficient Ethernet (EEE, 802.3az) support to bnx2x
boards with 84833 PHYs (and sufficiently new BC and external FW).

Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_hsi.h   |   61 ++++-
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_link.c  |  323 ++++++++++++++++++++-
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_link.h  |   26 ++
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c  |   23 ++-
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_reg.h   |  123 ++++++++
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_stats.c |    4 +
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_stats.h |    2 +
 7 files changed, 552 insertions(+), 10 deletions(-)
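
The way eee_mode selects the Tx LPI idle timer (bits 29:28, documented in the
bnx2x_link.h hunk) can be sketched in plain C as follows. This is only an
illustration under stated assumptions, not the driver function: the
nvram_power_mode parameter stands in for the REG_RD() of eee_power_mode from
shared memory, and the names eee_nvram_to_time() and eee_calc_timer() are
hypothetical simplifications of the helpers in bnx2x_link.c. The constants
are copied from the patch.

```c
#include <stdint.h>

/* Values copied from the bnx2x_link.h hunk of this patch. */
#define EEE_MODE_NVRAM_BALANCED_TIME   0xa00u
#define EEE_MODE_NVRAM_AGGRESSIVE_TIME 0x100u
#define EEE_MODE_NVRAM_LATENCY_TIME    0x6000u
#define EEE_MODE_NVRAM_MASK            0x3u
#define EEE_MODE_TIMER_MASK            0xfffffu
#define EEE_MODE_OUTPUT_TIME           (1u << 28)
#define EEE_MODE_OVERRIDE_NVRAM        (1u << 29)

/* PORT_FEAT_CFG_EEE_POWER_MODE_* values from the bnx2x_hsi.h hunk. */
#define EEE_POWER_MODE_DISABLED    0u
#define EEE_POWER_MODE_BALANCED    1u
#define EEE_POWER_MODE_AGGRESSIVE  2u
#define EEE_POWER_MODE_LOW_LATENCY 3u

/* Map an nvram power mode to an idle time in microseconds (0 = disabled). */
static uint32_t eee_nvram_to_time(uint32_t nvram_mode)
{
	switch (nvram_mode) {
	case EEE_POWER_MODE_BALANCED:
		return EEE_MODE_NVRAM_BALANCED_TIME;
	case EEE_POWER_MODE_AGGRESSIVE:
		return EEE_MODE_NVRAM_AGGRESSIVE_TIME;
	case EEE_POWER_MODE_LOW_LATENCY:
		return EEE_MODE_NVRAM_LATENCY_TIME;
	default:
		return 0;
	}
}

/* Pick the idle timer the way bits 29:28 of eee_mode dictate. */
static uint32_t eee_calc_timer(uint32_t eee_mode, uint32_t nvram_power_mode)
{
	if (eee_mode & EEE_MODE_OVERRIDE_NVRAM) {
		/* 2'b11: eee_mode carries the time itself in bits 19:0. */
		if (eee_mode & EEE_MODE_OUTPUT_TIME)
			return eee_mode & EEE_MODE_TIMER_MASK;
		/* 2'b10: eee_mode carries an nvram-style mode in bits 1:0. */
		return eee_nvram_to_time(eee_mode & EEE_MODE_NVRAM_MASK);
	}
	/* 2'b00 and 2'b01: the timer comes from the value stored in nvram. */
	return eee_nvram_to_time(nvram_power_mode);
}
```

A zero result means no usable timer, which is why the 8483x code in this
patch refuses to enable Tx LPI when the override path yields a timer of 0.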

diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_hsi.h b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_hsi.h
index a440a8b..c61aa37 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_hsi.h
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_hsi.h
@@ -1067,8 +1067,18 @@ struct port_feat_cfg {		    /* port 0: 0x454  port 1: 0x4c8 */
 	   uses the same defines as link_config */
 	u32 mfw_wol_link_cfg2;				    /* 0x480 */
 
-	u32 Reserved2[17];				    /* 0x484 */
 
+	/*  EEE power saving mode */
+	u32 eee_power_mode;                                 /* 0x484 */
+	#define PORT_FEAT_CFG_EEE_POWER_MODE_MASK                     0x000000FF
+	#define PORT_FEAT_CFG_EEE_POWER_MODE_SHIFT                    0
+	#define PORT_FEAT_CFG_EEE_POWER_MODE_DISABLED                 0x00000000
+	#define PORT_FEAT_CFG_EEE_POWER_MODE_BALANCED                 0x00000001
+	#define PORT_FEAT_CFG_EEE_POWER_MODE_AGGRESSIVE               0x00000002
+	#define PORT_FEAT_CFG_EEE_POWER_MODE_LOW_LATENCY              0x00000003
+
+
+	u32 Reserved2[16];                                  /* 0x488 */
 };
 
 
@@ -1255,6 +1265,8 @@ struct drv_func_mb {
 	#define DRV_MSG_CODE_DRV_INFO_ACK               0xd8000000
 	#define DRV_MSG_CODE_DRV_INFO_NACK              0xd9000000
 
+	#define DRV_MSG_CODE_EEE_RESULTS_ACK            0xda000000
+
 	#define DRV_MSG_CODE_SET_MF_BW                  0xe0000000
 	#define REQ_BC_VER_4_SET_MF_BW                  0x00060202
 	#define DRV_MSG_CODE_SET_MF_BW_ACK              0xe1000000
@@ -1320,6 +1332,8 @@ struct drv_func_mb {
 	#define FW_MSG_CODE_DRV_INFO_ACK                0xd8100000
 	#define FW_MSG_CODE_DRV_INFO_NACK               0xd9100000
 
+	#define FW_MSG_CODE_EEE_RESULS_ACK              0xda100000
+
 	#define FW_MSG_CODE_SET_MF_BW_SENT              0xe0000000
 	#define FW_MSG_CODE_SET_MF_BW_DONE              0xe1000000
 
@@ -1383,6 +1397,8 @@ struct drv_func_mb {
 
 	#define DRV_STATUS_DRV_INFO_REQ                 0x04000000
 
+	#define DRV_STATUS_EEE_NEGOTIATION_RESULTS      0x08000000
+
 	u32 virt_mac_upper;
 	#define VIRT_MAC_SIGN_MASK                      0xffff0000
 	#define VIRT_MAC_SIGNATURE                      0x564d0000
@@ -1613,6 +1629,11 @@ struct fw_flr_mb {
 	struct fw_flr_ack ack;
 };
 
+struct eee_remote_vals {
+	u32         tx_tw;
+	u32         rx_tw;
+};
+
 /**** SUPPORT FOR SHMEM ARRRAYS ***
  * The SHMEM HSI is aligned on 32 bit boundaries which makes it difficult to
  * define arrays with storage types smaller then unsigned dwords.
@@ -2053,6 +2074,41 @@ struct shmem2_region {
 #define DRV_INFO_CONTROL_OP_CODE_MASK      0x0000ff00
 #define DRV_INFO_CONTROL_OP_CODE_SHIFT     8
 	u32 ibft_host_addr; /* initialized by option ROM */
+	struct eee_remote_vals eee_remote_vals[PORT_MAX];
+	u32 reserved[E2_FUNC_MAX];
+
+
+	/* the status of EEE auto-negotiation
+	 * bits 15:0 the configured tx-lpi entry timer value. Depends on bit 31.
+	 * bits 19:16 the supported modes for EEE.
+	 * bits 23:20 the speeds advertised for EEE.
+	 * bits 27:24 the speeds the Link partner advertised for EEE.
+	 * The supported/adv. modes in bits 27:16 originate from the
+	 * SHMEM_EEE_XXX_ADV definitions (where XXX is replaced by speed).
+	 * bit 28 when 1'b1 EEE was requested.
+	 * bit 29 when 1'b1 tx lpi was requested.
+	 * bit 30 when 1'b1 EEE was negotiated. Tx lpi will be asserted iff
+	 * 30:29 are 2'b11.
+	 * bit 31 when 1'b0 bits 15:0 contain a PORT_FEAT_CFG_EEE_ define as
+	 * value. When 1'b1 those bits contain a value times 16 microseconds.
+	 */
+	u32 eee_status[PORT_MAX];
+	#define SHMEM_EEE_TIMER_MASK		   0x0000ffff
+	#define SHMEM_EEE_SUPPORTED_MASK	   0x000f0000
+	#define SHMEM_EEE_SUPPORTED_SHIFT	   16
+	#define SHMEM_EEE_ADV_STATUS_MASK	   0x00f00000
+		#define SHMEM_EEE_100M_ADV	   (1<<0)
+		#define SHMEM_EEE_1G_ADV	   (1<<1)
+		#define SHMEM_EEE_10G_ADV	   (1<<2)
+	#define SHMEM_EEE_ADV_STATUS_SHIFT	   20
+	#define	SHMEM_EEE_LP_ADV_STATUS_MASK	   0x0f000000
+	#define SHMEM_EEE_LP_ADV_STATUS_SHIFT	   24
+	#define SHMEM_EEE_REQUESTED_BIT		   0x10000000
+	#define SHMEM_EEE_LPI_REQUESTED_BIT	   0x20000000
+	#define SHMEM_EEE_ACTIVE_BIT		   0x40000000
+	#define SHMEM_EEE_TIME_OUTPUT_BIT	   0x80000000
+
+	u32 sizeof_port_stats;
 };
 
 
@@ -2599,6 +2655,9 @@ struct host_port_stats {
 	u32            pfc_frames_tx_lo;
 	u32            pfc_frames_rx_hi;
 	u32            pfc_frames_rx_lo;
+
+	u32            eee_lpi_count_hi;
+	u32            eee_lpi_count_lo;
 };
 
 
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_link.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_link.c
index a3fb721..c7c814d 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_link.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_link.c
@@ -1305,6 +1305,94 @@ int bnx2x_ets_strict(const struct link_params *params, const u8 strict_cos)
 
 	return 0;
 }
+
+/******************************************************************/
+/*			EEE section				   */
+/******************************************************************/
+static u8 bnx2x_eee_has_cap(struct link_params *params)
+{
+	struct bnx2x *bp = params->bp;
+
+	if (REG_RD(bp, params->shmem2_base) <=
+		   offsetof(struct shmem2_region, eee_status[params->port]))
+		return 0;
+
+	return 1;
+}
+
+static int bnx2x_eee_nvram_to_time(u32 nvram_mode, u32 *idle_timer)
+{
+	switch (nvram_mode) {
+	case PORT_FEAT_CFG_EEE_POWER_MODE_BALANCED:
+		*idle_timer = EEE_MODE_NVRAM_BALANCED_TIME;
+		break;
+	case PORT_FEAT_CFG_EEE_POWER_MODE_AGGRESSIVE:
+		*idle_timer = EEE_MODE_NVRAM_AGGRESSIVE_TIME;
+		break;
+	case PORT_FEAT_CFG_EEE_POWER_MODE_LOW_LATENCY:
+		*idle_timer = EEE_MODE_NVRAM_LATENCY_TIME;
+		break;
+	default:
+		*idle_timer = 0;
+		break;
+	}
+
+	return 0;
+}
+
+static int bnx2x_eee_time_to_nvram(u32 idle_timer, u32 *nvram_mode)
+{
+	switch (idle_timer) {
+	case EEE_MODE_NVRAM_BALANCED_TIME:
+		*nvram_mode = PORT_FEAT_CFG_EEE_POWER_MODE_BALANCED;
+		break;
+	case EEE_MODE_NVRAM_AGGRESSIVE_TIME:
+		*nvram_mode = PORT_FEAT_CFG_EEE_POWER_MODE_AGGRESSIVE;
+		break;
+	case EEE_MODE_NVRAM_LATENCY_TIME:
+		*nvram_mode = PORT_FEAT_CFG_EEE_POWER_MODE_LOW_LATENCY;
+		break;
+	default:
+		*nvram_mode = PORT_FEAT_CFG_EEE_POWER_MODE_DISABLED;
+		break;
+	}
+
+	return 0;
+}
+
+static u32 bnx2x_eee_calc_timer(struct link_params *params)
+{
+	u32 eee_mode, eee_idle;
+	struct bnx2x *bp = params->bp;
+
+	if (params->eee_mode & EEE_MODE_OVERRIDE_NVRAM) {
+		if (params->eee_mode & EEE_MODE_OUTPUT_TIME) {
+			/* time value in eee_mode --> used directly */
+			eee_idle = params->eee_mode & EEE_MODE_TIMER_MASK;
+		} else {
+			/* hsi value in eee_mode --> time */
+			if (bnx2x_eee_nvram_to_time(params->eee_mode &
+						    EEE_MODE_NVRAM_MASK,
+						    &eee_idle))
+				return 0;
+		}
+	} else {
+		/* hsi values in nvram --> time */
+		eee_mode = ((REG_RD(bp, params->shmem_base +
+				    offsetof(struct shmem_region, dev_info.
+				    port_feature_config[params->port].
+				    eee_power_mode)) &
+			     PORT_FEAT_CFG_EEE_POWER_MODE_MASK) >>
+			    PORT_FEAT_CFG_EEE_POWER_MODE_SHIFT);
+
+		if (bnx2x_eee_nvram_to_time(eee_mode, &eee_idle))
+			return 0;
+	}
+
+	return eee_idle;
+}
+
+
 /******************************************************************/
 /*			PFC section				  */
 /******************************************************************/
@@ -1729,6 +1817,14 @@ static int bnx2x_xmac_enable(struct link_params *params,
 	/* update PFC */
 	bnx2x_update_pfc_xmac(params, vars, 0);
 
+	if (vars->eee_status & SHMEM_EEE_ADV_STATUS_MASK) {
+		DP(NETIF_MSG_LINK, "Setting XMAC for EEE\n");
+		REG_WR(bp, xmac_base + XMAC_REG_EEE_TIMERS_HI, 0x1380008);
+		REG_WR(bp, xmac_base + XMAC_REG_EEE_CTRL, 0x1);
+	} else {
+		REG_WR(bp, xmac_base + XMAC_REG_EEE_CTRL, 0x0);
+	}
+
 	/* Enable TX and RX */
 	val = XMAC_CTRL_REG_TX_EN | XMAC_CTRL_REG_RX_EN;
 
@@ -2439,6 +2535,16 @@ static void bnx2x_update_mng(struct link_params *params, u32 link_status)
 			port_mb[params->port].link_status), link_status);
 }
 
+static void bnx2x_update_mng_eee(struct link_params *params, u32 eee_status)
+{
+	struct bnx2x *bp = params->bp;
+
+	if (bnx2x_eee_has_cap(params))
+		REG_WR(bp, params->shmem2_base +
+		       offsetof(struct shmem2_region,
+				eee_status[params->port]), eee_status);
+}
+
 static void bnx2x_update_pfc_nig(struct link_params *params,
 		struct link_vars *vars,
 		struct bnx2x_nig_brb_pfc_port_params *nig_params)
@@ -3950,6 +4056,20 @@ static void bnx2x_warpcore_set_10G_XFI(struct bnx2x_phy *phy,
 	bnx2x_cl45_write(bp, phy, MDIO_WC_DEVAD,
 			 MDIO_WC_REG_DIGITAL4_MISC3, val | 0x8080);
 
+	/* Enable LPI pass through */
+	if ((params->eee_mode & EEE_MODE_ADV_LPI) &&
+	    (phy->flags & FLAGS_EEE_10GBT) &&
+	    (!(params->eee_mode & EEE_MODE_ENABLE_LPI) ||
+	      bnx2x_eee_calc_timer(params)) &&
+	    (params->req_duplex[bnx2x_phy_selection(params)] == DUPLEX_FULL)) {
+		DP(NETIF_MSG_LINK, "Configure WC for LPI pass through\n");
+		bnx2x_cl45_write(bp, phy, MDIO_WC_DEVAD,
+				 MDIO_WC_REG_EEE_COMBO_CONTROL0,
+				 0x7c);
+		bnx2x_cl45_read_or_write(bp, phy, MDIO_WC_DEVAD,
+					 MDIO_WC_REG_DIGITAL4_MISC5, 0xc000);
+	}
+
 	/* 10G XFI Full Duplex */
 	bnx2x_cl45_write(bp, phy, MDIO_WC_DEVAD,
 			 MDIO_WC_REG_IEEE0BLK_MIICNTL, 0x100);
@@ -6462,6 +6582,15 @@ static int bnx2x_update_link_down(struct link_params *params,
 	       (MISC_REGISTERS_RESET_REG_2_RST_BMAC0 << port));
 	}
 	if (CHIP_IS_E3(bp)) {
+		REG_WR(bp, MISC_REG_CPMU_LP_FW_ENABLE_P0 + (params->port << 2),
+		       0);
+		REG_WR(bp, MISC_REG_CPMU_LP_DR_ENABLE, 0);
+		REG_WR(bp, MISC_REG_CPMU_LP_MASK_ENT_P0 + (params->port << 2),
+		       0);
+		vars->eee_status &= ~(SHMEM_EEE_LP_ADV_STATUS_MASK |
+				      SHMEM_EEE_ACTIVE_BIT);
+
+		bnx2x_update_mng_eee(params, vars->eee_status);
 		bnx2x_xmac_disable(params);
 		bnx2x_umac_disable(params);
 	}
@@ -6501,6 +6630,16 @@ static int bnx2x_update_link_up(struct link_params *params,
 			bnx2x_umac_enable(params, vars, 0);
 		bnx2x_set_led(params, vars,
 			      LED_MODE_OPER, vars->line_speed);
+
+		if ((vars->eee_status & SHMEM_EEE_ACTIVE_BIT) &&
+		    (vars->eee_status & SHMEM_EEE_LPI_REQUESTED_BIT)) {
+			DP(NETIF_MSG_LINK, "Enabling LPI assertion\n");
+			REG_WR(bp, MISC_REG_CPMU_LP_FW_ENABLE_P0 +
+			       (params->port << 2), 1);
+			REG_WR(bp, MISC_REG_CPMU_LP_DR_ENABLE, 1);
+			REG_WR(bp, MISC_REG_CPMU_LP_MASK_ENT_P0 +
+			       (params->port << 2), 0xfc20);
+		}
 	}
 	if ((CHIP_IS_E1x(bp) ||
 	     CHIP_IS_E2(bp))) {
@@ -6538,7 +6677,7 @@ static int bnx2x_update_link_up(struct link_params *params,
 
 	/* update shared memory */
 	bnx2x_update_mng(params, vars->link_status);
-
+	bnx2x_update_mng_eee(params, vars->eee_status);
 	/* Check remote fault */
 	for (phy_idx = INT_PHY; phy_idx < MAX_PHYS; phy_idx++) {
 		if (params->phy[phy_idx].flags & FLAGS_TX_ERROR_CHECK) {
@@ -6582,6 +6721,8 @@ int bnx2x_link_update(struct link_params *params, struct link_vars *vars)
 		phy_vars[phy_index].phy_link_up = 0;
 		phy_vars[phy_index].link_up = 0;
 		phy_vars[phy_index].fault_detected = 0;
+		/* different consideration, since vars holds inner state */
+		phy_vars[phy_index].eee_status = vars->eee_status;
 	}
 
 	if (USES_WARPCORE(bp))
@@ -6711,6 +6852,9 @@ int bnx2x_link_update(struct link_params *params, struct link_vars *vars)
 			vars->link_status |= LINK_STATUS_SERDES_LINK;
 		else
 			vars->link_status &= ~LINK_STATUS_SERDES_LINK;
+
+		vars->eee_status = phy_vars[active_external_phy].eee_status;
+
 		DP(NETIF_MSG_LINK, "Active external phy selected: %x\n",
 			   active_external_phy);
 	}
@@ -9579,9 +9723,9 @@ static int bnx2x_8481_config_init(struct bnx2x_phy *phy,
 static int bnx2x_84833_cmd_hdlr(struct bnx2x_phy *phy,
 				   struct link_params *params,
 		   u16 fw_cmd,
-		   u16 cmd_args[])
+		   u16 cmd_args[], int argc)
 {
-	u32 idx;
+	int idx;
 	u16 val;
 	struct bnx2x *bp = params->bp;
 	/* Write CMD_OPEN_OVERRIDE to STATUS reg */
@@ -9601,7 +9745,7 @@ static int bnx2x_84833_cmd_hdlr(struct bnx2x_phy *phy,
 	}
 
 	/* Prepare argument(s) and issue command */
-	for (idx = 0; idx < PHY84833_CMDHDLR_MAX_ARGS; idx++) {
+	for (idx = 0; idx < argc; idx++) {
 		bnx2x_cl45_write(bp, phy, MDIO_CTL_DEVAD,
 				MDIO_84833_CMD_HDLR_DATA1 + idx,
 				cmd_args[idx]);
@@ -9622,7 +9766,7 @@ static int bnx2x_84833_cmd_hdlr(struct bnx2x_phy *phy,
 		return -EINVAL;
 	}
 	/* Gather returning data */
-	for (idx = 0; idx < PHY84833_CMDHDLR_MAX_ARGS; idx++) {
+	for (idx = 0; idx < argc; idx++) {
 		bnx2x_cl45_read(bp, phy, MDIO_CTL_DEVAD,
 				MDIO_84833_CMD_HDLR_DATA1 + idx,
 				&cmd_args[idx]);
@@ -9656,7 +9800,7 @@ static int bnx2x_84833_pair_swap_cfg(struct bnx2x_phy *phy,
 	data[1] = (u16)pair_swap;
 
 	status = bnx2x_84833_cmd_hdlr(phy, params,
-		PHY84833_CMD_SET_PAIR_SWAP, data);
+		PHY84833_CMD_SET_PAIR_SWAP, data, PHY84833_CMDHDLR_MAX_ARGS);
 	if (status == 0)
 		DP(NETIF_MSG_LINK, "Pairswap OK, val=0x%x\n", data[1]);
 
@@ -9734,6 +9878,95 @@ static int bnx2x_84833_hw_reset_phy(struct bnx2x_phy *phy,
 	return 0;
 }
 
+static int bnx2x_8483x_eee_timers(struct link_params *params,
+				   struct link_vars *vars)
+{
+	u32 eee_idle = 0, eee_mode;
+	struct bnx2x *bp = params->bp;
+
+	eee_idle = bnx2x_eee_calc_timer(params);
+
+	if (eee_idle) {
+		REG_WR(bp, MISC_REG_CPMU_LP_IDLE_THR_P0 + (params->port << 2),
+		       eee_idle);
+	} else if ((params->eee_mode & EEE_MODE_ENABLE_LPI) &&
+		   (params->eee_mode & EEE_MODE_OVERRIDE_NVRAM) &&
+		   (params->eee_mode & EEE_MODE_OUTPUT_TIME)) {
+		DP(NETIF_MSG_LINK, "Error: Tx LPI is enabled with timer 0\n");
+		return -EINVAL;
+	}
+
+	vars->eee_status &= ~(SHMEM_EEE_TIMER_MASK | SHMEM_EEE_TIME_OUTPUT_BIT);
+	if (params->eee_mode & EEE_MODE_OUTPUT_TIME) {
+		/* eee_idle in 1u --> eee_status in 16u */
+		eee_idle >>= 4;
+		vars->eee_status |= (eee_idle & SHMEM_EEE_TIMER_MASK) |
+				    SHMEM_EEE_TIME_OUTPUT_BIT;
+	} else {
+		if (bnx2x_eee_time_to_nvram(eee_idle, &eee_mode))
+			return -EINVAL;
+		vars->eee_status |= eee_mode;
+	}
+
+	return 0;
+}
+
+static int bnx2x_8483x_disable_eee(struct bnx2x_phy *phy,
+				   struct link_params *params,
+				   struct link_vars *vars)
+{
+	int rc;
+	struct bnx2x *bp = params->bp;
+	u16 cmd_args = 0;
+
+	DP(NETIF_MSG_LINK, "Don't Advertise 10GBase-T EEE\n");
+
+	/* Make certain LPI is disabled */
+	REG_WR(bp, MISC_REG_CPMU_LP_FW_ENABLE_P0 + (params->port << 2), 0);
+	REG_WR(bp, MISC_REG_CPMU_LP_DR_ENABLE, 0);
+
+	/* Prevent Phy from working in EEE and advertising it */
+	rc = bnx2x_84833_cmd_hdlr(phy, params,
+		PHY84833_CMD_SET_EEE_MODE, &cmd_args, 1);
+	if (rc != 0) {
+		DP(NETIF_MSG_LINK, "EEE disable failed.\n");
+		return rc;
+	}
+
+	bnx2x_cl45_write(bp, phy, MDIO_AN_DEVAD, MDIO_AN_REG_EEE_ADV, 0);
+	vars->eee_status &= ~SHMEM_EEE_ADV_STATUS_MASK;
+
+	return 0;
+}
+
+static int bnx2x_8483x_enable_eee(struct bnx2x_phy *phy,
+				   struct link_params *params,
+				   struct link_vars *vars)
+{
+	int rc;
+	struct bnx2x *bp = params->bp;
+	u16 cmd_args = 1;
+
+	DP(NETIF_MSG_LINK, "Advertise 10GBase-T EEE\n");
+
+	rc = bnx2x_84833_cmd_hdlr(phy, params,
+		PHY84833_CMD_SET_EEE_MODE, &cmd_args, 1);
+	if (rc != 0) {
+		DP(NETIF_MSG_LINK, "EEE enable failed.\n");
+		return rc;
+	}
+
+	bnx2x_cl45_write(bp, phy, MDIO_AN_DEVAD, MDIO_AN_REG_EEE_ADV, 0x8);
+
+	/* Mask events preventing LPI generation */
+	REG_WR(bp, MISC_REG_CPMU_LP_MASK_EXT_P0 + (params->port << 2), 0xfc20);
+
+	vars->eee_status &= ~SHMEM_EEE_ADV_STATUS_MASK;
+	vars->eee_status |= (SHMEM_EEE_10G_ADV << SHMEM_EEE_ADV_STATUS_SHIFT);
+
+	return 0;
+}
+
 #define PHY84833_CONSTANT_LATENCY 1193
 static int bnx2x_848x3_config_init(struct bnx2x_phy *phy,
 				   struct link_params *params,
@@ -9833,7 +10066,8 @@ static int bnx2x_848x3_config_init(struct bnx2x_phy *phy,
 		cmd_args[2] = PHY84833_CONSTANT_LATENCY + 1;
 		cmd_args[3] = PHY84833_CONSTANT_LATENCY;
 		rc = bnx2x_84833_cmd_hdlr(phy, params,
-			PHY84833_CMD_SET_EEE_MODE, cmd_args);
+			PHY84833_CMD_SET_EEE_MODE, cmd_args,
+			PHY84833_CMDHDLR_MAX_ARGS);
 		if (rc != 0)
 			DP(NETIF_MSG_LINK, "Cfg AutogrEEEn failed.\n");
 	}
@@ -9858,6 +10092,48 @@ static int bnx2x_848x3_config_init(struct bnx2x_phy *phy,
 				 MDIO_CTL_REG_84823_USER_CTRL_REG, val);
 	}
 
+	bnx2x_cl45_read(bp, phy, MDIO_CTL_DEVAD,
+			MDIO_84833_TOP_CFG_FW_REV, &val);
+
+	/* Configure EEE support */
+	if ((val >= MDIO_84833_TOP_CFG_FW_EEE) && bnx2x_eee_has_cap(params)) {
+		phy->flags |= FLAGS_EEE_10GBT;
+		vars->eee_status |= SHMEM_EEE_10G_ADV <<
+				    SHMEM_EEE_SUPPORTED_SHIFT;
+		/* Propagate params' bits --> vars (for migration exposure) */
+		if (params->eee_mode & EEE_MODE_ENABLE_LPI)
+			vars->eee_status |= SHMEM_EEE_LPI_REQUESTED_BIT;
+		else
+			vars->eee_status &= ~SHMEM_EEE_LPI_REQUESTED_BIT;
+
+		if (params->eee_mode & EEE_MODE_ADV_LPI)
+			vars->eee_status |= SHMEM_EEE_REQUESTED_BIT;
+		else
+			vars->eee_status &= ~SHMEM_EEE_REQUESTED_BIT;
+
+		rc = bnx2x_8483x_eee_timers(params, vars);
+		if (rc != 0) {
+			DP(NETIF_MSG_LINK, "Failed to configure EEE timers\n");
+			bnx2x_8483x_disable_eee(phy, params, vars);
+			return rc;
+		}
+
+		if ((params->req_duplex[actual_phy_selection] == DUPLEX_FULL) &&
+		    (params->eee_mode & EEE_MODE_ADV_LPI) &&
+		    (bnx2x_eee_calc_timer(params) ||
+		     !(params->eee_mode & EEE_MODE_ENABLE_LPI)))
+			rc = bnx2x_8483x_enable_eee(phy, params, vars);
+		else
+			rc = bnx2x_8483x_disable_eee(phy, params, vars);
+		if (rc != 0) {
+			DP(NETIF_MSG_LINK, "Failed to set EEE advertisement\n");
+			return rc;
+		}
+	} else {
+		phy->flags &= ~FLAGS_EEE_10GBT;
+		vars->eee_status &= ~SHMEM_EEE_SUPPORTED_MASK;
+	}
+
 	if (phy->type == PORT_HW_CFG_XGXS_EXT_PHY_TYPE_BCM84833) {
 		/* Bring PHY out of super isolate mode as the final step. */
 		bnx2x_cl45_read(bp, phy,
@@ -9989,6 +10265,31 @@ static u8 bnx2x_848xx_read_status(struct bnx2x_phy *phy,
 		if (val & (1<<11))
 			vars->link_status |=
 				LINK_STATUS_LINK_PARTNER_10GXFD_CAPABLE;
+
+		/* Determine if EEE was negotiated */
+		if (phy->type == PORT_HW_CFG_XGXS_EXT_PHY_TYPE_BCM84833) {
+			u32 eee_shmem = 0;
+
+			bnx2x_cl45_read(bp, phy, MDIO_AN_DEVAD,
+					MDIO_AN_REG_EEE_ADV, &val1);
+			bnx2x_cl45_read(bp, phy, MDIO_AN_DEVAD,
+					MDIO_AN_REG_LP_EEE_ADV, &val2);
+			if ((val1 & val2) & 0x8) {
+				DP(NETIF_MSG_LINK, "EEE negotiated\n");
+				vars->eee_status |= SHMEM_EEE_ACTIVE_BIT;
+			}
+
+			if (val2 & 0x12)
+				eee_shmem |= SHMEM_EEE_100M_ADV;
+			if (val2 & 0x4)
+				eee_shmem |= SHMEM_EEE_1G_ADV;
+			if (val2 & 0x68)
+				eee_shmem |= SHMEM_EEE_10G_ADV;
+
+			vars->eee_status &= ~SHMEM_EEE_LP_ADV_STATUS_MASK;
+			vars->eee_status |= (eee_shmem <<
+					     SHMEM_EEE_LP_ADV_STATUS_SHIFT);
+		}
 	}
 
 	return link_up;
@@ -11243,7 +11544,8 @@ static struct bnx2x_phy phy_84833 = {
 	.def_md_devad	= 0,
 	.flags		= (FLAGS_FAN_FAILURE_DET_REQ |
 			   FLAGS_REARM_LATCH_SIGNAL |
-			   FLAGS_TX_ERROR_CHECK),
+			   FLAGS_TX_ERROR_CHECK |
+			   FLAGS_EEE_10GBT),
 	.rx_preemphasis	= {0xffff, 0xffff, 0xffff, 0xffff},
 	.tx_preemphasis	= {0xffff, 0xffff, 0xffff, 0xffff},
 	.mdio_ctrl	= 0,
@@ -12011,6 +12313,8 @@ int bnx2x_phy_init(struct link_params *params, struct link_vars *vars)
 		break;
 	}
 	bnx2x_update_mng(params, vars->link_status);
+
+	bnx2x_update_mng_eee(params, vars->eee_status);
 	return 0;
 }
 
@@ -12023,6 +12327,9 @@ int bnx2x_link_reset(struct link_params *params, struct link_vars *vars,
 	/* disable attentions */
 	vars->link_status = 0;
 	bnx2x_update_mng(params, vars->link_status);
+	vars->eee_status &= ~(SHMEM_EEE_LP_ADV_STATUS_MASK |
+			      SHMEM_EEE_ACTIVE_BIT);
+	bnx2x_update_mng_eee(params, vars->eee_status);
 	bnx2x_bits_dis(bp, NIG_REG_MASK_INTERRUPT_PORT0 + port*4,
 		       (NIG_MASK_XGXS0_LINK_STATUS |
 			NIG_MASK_XGXS0_LINK10G |
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_link.h b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_link.h
index ea4371f..e920800 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_link.h
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_link.h
@@ -149,6 +149,7 @@ struct bnx2x_phy {
 #define FLAGS_DUMMY_READ		(1<<9)
 #define FLAGS_MDC_MDIO_WA_B0		(1<<10)
 #define FLAGS_TX_ERROR_CHECK		(1<<12)
+#define FLAGS_EEE_10GBT			(1<<13)
 
 	/* preemphasis values for the rx side */
 	u16 rx_preemphasis[4];
@@ -265,6 +266,30 @@ struct link_params {
 	u8 num_phys;
 
 	u8 rsrv;
+
+	/* Used to configure the EEE Tx LPI timer, has several modes of
+	 * operation, according to bits 29:28 -
+	 * 2'b00: Timer will be configured by nvram, output will be the value
+	 *        from nvram.
+	 * 2'b01: Timer will be configured by nvram, output will be in
+	 *        microseconds.
+	 * 2'b10: bits 1:0 contain an nvram value which will be used instead
+	 *        of the one located in the nvram. Output will be that value.
+	 * 2'b11: bits 19:0 contain the idle timer in microseconds; output
+	 *        will be in microseconds.
+	 * Bits 31:30 should be 2'b11 in order for EEE to be enabled.
+	 */
+	u32 eee_mode;
+#define EEE_MODE_NVRAM_BALANCED_TIME		(0xa00)
+#define EEE_MODE_NVRAM_AGGRESSIVE_TIME		(0x100)
+#define EEE_MODE_NVRAM_LATENCY_TIME		(0x6000)
+#define EEE_MODE_NVRAM_MASK		(0x3)
+#define EEE_MODE_TIMER_MASK		(0xfffff)
+#define EEE_MODE_OUTPUT_TIME		(1<<28)
+#define EEE_MODE_OVERRIDE_NVRAM		(1<<29)
+#define EEE_MODE_ENABLE_LPI		(1<<30)
+#define EEE_MODE_ADV_LPI			(1<<31)
+
 	u16 hw_led_mode; /* part of the hw_config read from the shmem */
 	u32 multi_phy_config;
 
@@ -301,6 +326,7 @@ struct link_vars {
 
 	/* The same definitions as the shmem parameter */
 	u32 link_status;
+	u32 eee_status;
 	u8 fault_detected;
 	u8 rsrv1;
 	u16 periodic_flags;
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
index f755a66..a622bb7 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
@@ -3176,6 +3176,12 @@ static void bnx2x_set_mf_bw(struct bnx2x *bp)
 	bnx2x_fw_command(bp, DRV_MSG_CODE_SET_MF_BW_ACK, 0);
 }
 
+static void bnx2x_handle_eee_event(struct bnx2x *bp)
+{
+	DP(BNX2X_MSG_MCP, "EEE - LLDP event\n");
+	bnx2x_fw_command(bp, DRV_MSG_CODE_EEE_RESULTS_ACK, 0);
+}
+
 static void bnx2x_handle_drv_info_req(struct bnx2x *bp)
 {
 	enum drv_info_opcode op_code;
@@ -3742,6 +3748,8 @@ static void bnx2x_attn_int_deasserted3(struct bnx2x *bp, u32 attn)
 			if (val & DRV_STATUS_AFEX_EVENT_MASK)
 				bnx2x_handle_afex_cmd(bp,
 					val & DRV_STATUS_AFEX_EVENT_MASK);
+			if (val & DRV_STATUS_EEE_NEGOTIATION_RESULTS)
+				bnx2x_handle_eee_event(bp);
 			if (bp->link_vars.periodic_flags &
 			    PERIODIC_FLAGS_LINK_EVENT) {
 				/*  sync with link */
@@ -10082,7 +10090,7 @@ static void __devinit bnx2x_get_port_hwinfo(struct bnx2x *bp)
 {
 	int port = BP_PORT(bp);
 	u32 config;
-	u32 ext_phy_type, ext_phy_config;
+	u32 ext_phy_type, ext_phy_config, eee_mode;
 
 	bp->link_params.bp = bp;
 	bp->link_params.port = port;
@@ -10149,6 +10157,19 @@ static void __devinit bnx2x_get_port_hwinfo(struct bnx2x *bp)
 		bp->port.need_hw_lock = bnx2x_hw_lock_required(bp,
 							bp->common.shmem_base,
 							bp->common.shmem2_base);
+
+	/* Configure link feature according to nvram value */
+	eee_mode = (((SHMEM_RD(bp, dev_info.
+		      port_feature_config[port].eee_power_mode)) &
+		     PORT_FEAT_CFG_EEE_POWER_MODE_MASK) >>
+		    PORT_FEAT_CFG_EEE_POWER_MODE_SHIFT);
+	if (eee_mode != PORT_FEAT_CFG_EEE_POWER_MODE_DISABLED) {
+		bp->link_params.eee_mode = EEE_MODE_ADV_LPI |
+					   EEE_MODE_ENABLE_LPI |
+					   EEE_MODE_OUTPUT_TIME;
+	} else {
+		bp->link_params.eee_mode = 0;
+	}
 }
 
 void bnx2x_get_iscsi_info(struct bnx2x *bp)
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_reg.h b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_reg.h
index bbd3874..bfef98f 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_reg.h
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_reg.h
@@ -1488,6 +1488,121 @@
  * 2:1 - otp_misc_do[51:50]; 0 - otp_misc_do[1]. */
 #define MISC_REG_CHIP_TYPE					 0xac60
 #define MISC_REG_CHIP_TYPE_57811_MASK				 (1<<1)
+#define MISC_REG_CPMU_LP_DR_ENABLE				 0xa858
+/* [RW 1] FW EEE LPI Enable. When 1 indicates that EEE LPI mode is enabled
+ * by FW. When 0 indicates that the EEE LPI mode is disabled by FW. Clk
+ * 25MHz. Reset on hard reset. */
+#define MISC_REG_CPMU_LP_FW_ENABLE_P0				 0xa84c
+/* [RW 32] EEE LPI Idle Threshold. The threshold value for the idle EEE LPI
+ * counter. Timer tick is 1 us. Clock 25MHz. Reset on hard reset. */
+#define MISC_REG_CPMU_LP_IDLE_THR_P0				 0xa8a0
+/* [RW 18] LPI entry events mask. [0] - Vmain SM Mask. When 1 indicates that
+ * the Vmain SM end state is disabled. When 0 indicates that the Vmain SM
+ * end state is enabled. [1] - FW Queues Empty Mask. When 1 indicates that
+ * the FW command that all Queues are empty is disabled. When 0 indicates
+ * that the FW command that all Queues are empty is enabled. [2] - FW Early
+ * Exit Mask / Reserved (Entry mask). When 1 indicates that the FW Early
+ * Exit command is disabled. When 0 indicates that the FW Early Exit command
+ * is enabled. This bit applicable only in the EXIT Events Mask registers.
+ * [3] - PBF Request Mask. When 1 indicates that the PBF Request indication
+ * is disabled. When 0 indicates that the PBF Request indication is enabled.
+ * [4] - Tx Request Mask. When =1 indicates that the Tx other Than PBF
+ * Request indication is disabled. When 0 indicates that the Tx Other Than
+ * PBF Request indication is enabled. [5] - Rx EEE LPI Status Mask. When 1
+ * indicates that the RX EEE LPI Status indication is disabled. When 0
+ * indicates that the RX EEE LPI Status indication is enabled. In the EXIT
+ * Events Masks registers; this bit masks the falling edge detect of the LPI
+ * Status (Rx LPI is on - off). [6] - Tx Pause Mask. When 1 indicates that
+ * the Tx Pause indication is disabled. When 0 indicates that the Tx Pause
+ * indication is enabled. [7] - BRB1 Empty Mask. When 1 indicates that the
+ * BRB1 EMPTY indication is disabled. When 0 indicates that the BRB1 EMPTY
+ * indication is enabled. [8] - QM Idle Mask. When 1 indicates that the QM
+ * IDLE indication is disabled. When 0 indicates that the QM IDLE indication
+ * is enabled. (One bit for both VOQ0 and VOQ1). [9] - QM LB Idle Mask. When
+ * 1 indicates that the QM IDLE indication for LOOPBACK is disabled. When 0
+ * indicates that the QM IDLE indication for LOOPBACK is enabled. [10] - L1
+ * Status Mask. When 1 indicates that the L1 Status indication from the PCIE
+ * CORE is disabled. When 0 indicates that the RX EEE LPI Status indication
+ * from the PCIE CORE is enabled. In the EXIT Events Masks registers; this
+ * bit masks the falling edge detect of the L1 status (L1 is on - off). [11]
+ * - P0 E0 EEE EEE LPI REQ Mask. When =1 indicates that the P0 E0 EEE EEE
+ * LPI REQ indication is disabled. When =0 indicates that the P0 E0 EEE LPI
+ * REQ indication is enabled. [12] - P1 E0 EEE LPI REQ Mask. When =1
+ * indicates that the P0 EEE LPI REQ indication is disabled. When =0
+ * indicates that the P0 EEE LPI REQ indication is enabled. [13] - P0 E1 EEE
+ * LPI REQ Mask. When =1 indicates that the P0 EEE LPI REQ indication is
+ * disabled. When =0 indicates that the P0 EEE LPI REQ indication is
+ * enabled. [14] - P1 E1 EEE LPI REQ Mask. When =1 indicates that the P0 EEE
+ * LPI REQ indication is disabled. When =0 indicates that the P0 EEE LPI REQ
+ * indication is enabled. [15] - L1 REQ Mask. When =1 indicates that the L1
+ * REQ indication is disabled. When =0 indicates that the L1 indication is
+ * enabled. [16] - Rx EEE LPI Status Edge Detect Mask. When =1 indicates
+ * that the RX EEE LPI Status Falling Edge Detect indication is disabled (Rx
+ * EEE LPI is on - off). When =0 indicates that the RX EEE LPI Status
+ * Falling Edge Detect indication is enabled (Rx EEE LPI is on - off). This
+ * bit is applicable only in the EXIT Events Masks registers. [17] - L1
+ * Status Edge Detect Mask. When =1 indicates that the L1 Status Falling
+ * Edge Detect indication from the PCIE CORE is disabled (L1 is on - off).
+ * When =0 indicates that the L1 Status Falling Edge Detect indication from
+ * the PCIE CORE is enabled (L1 is on - off). This bit is applicable only in
+ * the EXIT Events Masks registers. Clock 25MHz. Reset on hard reset. */
+#define MISC_REG_CPMU_LP_MASK_ENT_P0				 0xa880
+/* [RW 18] EEE LPI exit events mask. [0] - Vmain SM Mask. When 1 indicates
+ * that the Vmain SM end state is disabled. When 0 indicates that the Vmain
+ * SM end state is enabled. [1] - FW Queues Empty Mask. When 1 indicates
+ * that the FW command that all Queues are empty is disabled. When 0
+ * indicates that the FW command that all Queues are empty is enabled. [2] -
+ * FW Early Exit Mask / Reserved (Entry mask). When 1 indicates that the FW
+ * Early Exit command is disabled. When 0 indicates that the FW Early Exit
+ * command is enabled. This bit applicable only in the EXIT Events Mask
+ * registers. [3] - PBF Request Mask. When 1 indicates that the PBF Request
+ * indication is disabled. When 0 indicates that the PBF Request indication
+ * is enabled. [4] - Tx Request Mask. When =1 indicates that the Tx other
+ * Than PBF Request indication is disabled. When 0 indicates that the Tx
+ * Other Than PBF Request indication is enabled. [5] - Rx EEE LPI Status
+ * Mask. When 1 indicates that the RX EEE LPI Status indication is disabled.
+ * When 0 indicates that the RX LPI Status indication is enabled. In the
+ * EXIT Events Masks registers; this bit masks the falling edge detect of
+ * the EEE LPI Status (Rx EEE LPI is on - off). [6] - Tx Pause Mask. When 1
+ * indicates that the Tx Pause indication is disabled. When 0 indicates that
+ * the Tx Pause indication is enabled. [7] - BRB1 Empty Mask. When 1
+ * indicates that the BRB1 EMPTY indication is disabled. When 0 indicates
+ * that the BRB1 EMPTY indication is enabled. [8] - QM Idle Mask. When 1
+ * indicates that the QM IDLE indication is disabled. When 0 indicates that
+ * the QM IDLE indication is enabled. (One bit for both VOQ0 and VOQ1). [9]
+ * - QM LB Idle Mask. When 1 indicates that the QM IDLE indication for
+ * LOOPBACK is disabled. When 0 indicates that the QM IDLE indication for
+ * LOOPBACK is enabled. [10] - L1 Status Mask. When 1 indicates that the L1
+ * Status indication from the PCIE CORE is disabled. When 0 indicates that
+ * the RX EEE LPI Status indication from the PCIE CORE is enabled. In the
+ * EXIT Events Masks registers; this bit masks the falling edge detect of
+ * the L1 status (L1 is on - off). [11] - P0 E0 EEE LPI REQ Mask. When
+ * =1 indicates that the P0 E0 EEE LPI REQ indication is disabled. When
+ * =0 indicates that the P0 E0 EEE LPI REQ indication is enabled. [12] - P1
+ * E0 EEE LPI REQ Mask. When =1 indicates that the P0 EEE LPI REQ indication
+ * is disabled. When =0 indicates that the P0 EEE LPI REQ indication is
+ * enabled. [13] - P0 E1 EEE LPI REQ Mask. When =1 indicates that the P0 EEE
+ * LPI REQ indication is disabled. When =0 indicates that the P0 EEE LPI REQ
+ * indication is enabled. [14] - P1 E1 EEE LPI REQ Mask. When =1 indicates
+ * that the P0 EEE LPI REQ indication is disabled. When =0 indicates that
+ * the P0 EEE LPI REQ indication is enabled. [15] - L1 REQ Mask. When =1
+ * indicates that the L1 REQ indication is disabled. When =0 indicates that
+ * the L1 indication is enabled. [16] - Rx EEE LPI Status Edge Detect Mask.
+ * When =1 indicates that the RX EEE LPI Status Falling Edge Detect
+ * indication is disabled (Rx EEE LPI is on - off). When =0 indicates that
+ * the RX EEE LPI Status Falling Edge Detect indication is enabled (Rx EEE
+ * LPI is on - off). This bit is applicable only in the EXIT Events Masks
+ * registers. [17] - L1 Status Edge Detect Mask. When =1 indicates that the
+ * L1 Status Falling Edge Detect indication from the PCIE CORE is disabled
+ * (L1 is on - off). When =0 indicates that the L1 Status Falling Edge
+ * Detect indication from the PCIE CORE is enabled (L1 is on - off). This
+ * bit is applicable only in the EXIT Events Masks registers. Clock 25MHz.
+ * Reset on hard reset. */
+#define MISC_REG_CPMU_LP_MASK_EXT_P0				 0xa888
+/* [RW 16] EEE LPI Entry Events Counter. A statistic counter with the number
+ * of counts that the SM entered the EEE LPI state. Clock 25MHz. Read only
+ * register. Reset on hard reset. */
+#define MISC_REG_CPMU_LP_SM_ENT_CNT_P0				 0xa8b8
 /* [RW 32] The following driver registers(1...16) represent 16 drivers and
    32 clients. Each client can be controlled by one driver only. One in each
    bit represent that this driver control the appropriate client (Ex: bit 5
@@ -5372,6 +5487,8 @@
 /* [RW 32] Lower 48 bits of ctrl_sa register. Used as the SA in PAUSE/PFC
  * packets transmitted by the MAC */
 #define XMAC_REG_CTRL_SA_LO					 0x28
+#define XMAC_REG_EEE_CTRL					 0xd8
+#define XMAC_REG_EEE_TIMERS_HI					 0xe4
 #define XMAC_REG_PAUSE_CTRL					 0x68
 #define XMAC_REG_PFC_CTRL					 0x70
 #define XMAC_REG_PFC_CTRL_HI					 0x74
@@ -6813,6 +6930,8 @@ Theotherbitsarereservedandshouldbezero*/
 #define MDIO_AN_REG_LP_AUTO_NEG		0x0013
 #define MDIO_AN_REG_LP_AUTO_NEG2	0x0014
 #define MDIO_AN_REG_MASTER_STATUS	0x0021
+#define MDIO_AN_REG_EEE_ADV		0x003c
+#define MDIO_AN_REG_LP_EEE_ADV		0x003d
 /*bcm*/
 #define MDIO_AN_REG_LINK_STATUS 	0x8304
 #define MDIO_AN_REG_CL37_CL73		0x8370
@@ -6866,6 +6985,8 @@ Theotherbitsarereservedandshouldbezero*/
 #define MDIO_PMA_REG_84823_LED3_STRETCH_EN			0x0080
 
 /* BCM84833 only */
+#define MDIO_84833_TOP_CFG_FW_REV			0x400f
+#define MDIO_84833_TOP_CFG_FW_EEE		0x10b1
 #define MDIO_84833_TOP_CFG_XGPHY_STRAP1			0x401a
 #define MDIO_84833_SUPER_ISOLATE		0x8000
 /* These are mailbox register set used by 84833. */
@@ -6993,11 +7114,13 @@ Theotherbitsarereservedandshouldbezero*/
 #define MDIO_WC_REG_DIGITAL3_UP1			0x8329
 #define MDIO_WC_REG_DIGITAL3_LP_UP1			 0x832c
 #define MDIO_WC_REG_DIGITAL4_MISC3			0x833c
+#define MDIO_WC_REG_DIGITAL4_MISC5			0x833e
 #define MDIO_WC_REG_DIGITAL5_MISC6			0x8345
 #define MDIO_WC_REG_DIGITAL5_MISC7			0x8349
 #define MDIO_WC_REG_DIGITAL5_ACTUAL_SPEED		0x834e
 #define MDIO_WC_REG_DIGITAL6_MP5_NEXTPAGECTRL		0x8350
 #define MDIO_WC_REG_CL49_USERB0_CTRL			0x8368
+#define MDIO_WC_REG_EEE_COMBO_CONTROL0			0x8390
 #define MDIO_WC_REG_TX66_CONTROL			0x83b0
 #define MDIO_WC_REG_RX66_CONTROL			0x83c0
 #define MDIO_WC_REG_RX66_SCW0				0x83c2
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_stats.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_stats.c
index 1e2785c..0e8bdcb 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_stats.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_stats.c
@@ -785,6 +785,10 @@ static int bnx2x_hw_stats_update(struct bnx2x *bp)
 
 	pstats->host_port_stats_counter++;
 
+	if (CHIP_IS_E3(bp))
+		estats->eee_tx_lpi += REG_RD(bp,
+					     MISC_REG_CPMU_LP_SM_ENT_CNT_P0);
+
 	if (!BP_NOMCP(bp)) {
 		u32 nig_timer_max =
 			SHMEM_RD(bp, port_mb[BP_PORT(bp)].stat_nig_timer);
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_stats.h b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_stats.h
index 93e689fd..24b8e50 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_stats.h
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_stats.h
@@ -203,6 +203,8 @@ struct bnx2x_eth_stats {
 	/* Recovery */
 	u32 recoverable_error;
 	u32 unrecoverable_error;
+	/* src: Clear-on-Read register; Will not survive PMF Migration */
+	u32 eee_tx_lpi;
 };
 
 
-- 
1.7.9.rc2
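
The stats hunk above adds the value read from a clear-on-read register to
a running software total on every update. A minimal userspace sketch of
that accumulation pattern follows. The struct and both helpers are
hypothetical stand-ins for the hardware register and REG_RD(), not bnx2x
code:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a clear-on-read hardware counter. */
struct cor_counter {
	uint16_t pending;	/* events counted since the last read */
};

/* Reading returns the pending count and resets it to zero. */
static uint16_t cor_read(struct cor_counter *c)
{
	uint16_t val = c->pending;

	c->pending = 0;
	return val;
}

/* Fold the latest hardware delta into the running software total. */
static void stats_update(uint32_t *total, struct cor_counter *c)
{
	assert(total && c);
	*total += cor_read(c);
}
```

Because each read clears the register, the only complete count is the
software total, which is presumably why the comment in bnx2x_stats.h
warns that the value will not survive PMF migration.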

^ permalink raw reply related

* [v4 net-next PATCH 0/3] Energy Efficient Ethernet (eee) support
From: Yuval Mintz @ 2012-06-07  3:13 UTC (permalink / raw)
  To: davem, netdev; +Cc: eilong, peppe.cavallaro, bhutchings, Yuval Mintz

Hi Dave,

This patch series adds Energy Efficient Ethernet (EEE) support to the
bnx2x driver (for new chips with appropriate PHYs).
It also extends the ethtool API to enable control of the EEE feature.

Another patch series has been sent to Ben to allow the ethtool application
to use this new API.

Changes from Version 3:
	Patch 1/3:
		-Corrected function pointer check in 'ethtool_set_eee'.

Changes from Version 2:
	Patch 1/3:
		-Corrected ethtool_eee documentation in ethtool.h.

Changes from Version 1:
	Patch 1/3:
		-Added documentation to ethtool_eee struct in header.
		-Clearing the ethtool_eee struct before passing to driver.
		-Checking the driver's return value of 'get_eee' call.
	Patches 2-3/3:
		-Corrected conversion of tx_lpi_timer speeds in bnx2x.

Please consider applying it to 'net-next'.

Thanks,
Yuval Mintz

^ permalink raw reply

* [v4 net-next PATCH 1/3] Added kernel support in EEE Ethtool commands
From: Yuval Mintz @ 2012-06-07  3:13 UTC (permalink / raw)
  To: davem, netdev; +Cc: eilong, peppe.cavallaro, bhutchings, Yuval Mintz
In-Reply-To: <1339038788-3447-1-git-send-email-yuvalmin@broadcom.com>

This patch extends the kernel's ethtool interface by adding support
for 2 new EEE commands - get_eee and set_eee.

Thanks go to Giuseppe Cavallaro for his original patch adding this support.

Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
---
 include/linux/ethtool.h |   35 +++++++++++++++++++++++++++++++++++
 net/core/ethtool.c      |   40 ++++++++++++++++++++++++++++++++++++++++
 2 files changed, 75 insertions(+), 0 deletions(-)

diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h
index e17fa71..a518361 100644
--- a/include/linux/ethtool.h
+++ b/include/linux/ethtool.h
@@ -137,6 +137,35 @@ struct ethtool_eeprom {
 };
 
 /**
+ * struct ethtool_eee - Energy Efficient Ethernet information
+ * @cmd: ETHTOOL_{G,S}EEE
+ * @supported: Mask of %SUPPORTED_* flags for the speed/duplex combinations
+ *	for which there is EEE support.
+ * @advertised: Mask of %ADVERTISED_* flags for the speed/duplex combinations
+ *	advertised as eee capable.
+ * @lp_advertised: Mask of %ADVERTISED_* flags for the speed/duplex
+ *	combinations advertised by the link partner as eee capable.
+ * @eee_active: Result of the eee auto negotiation.
+ * @eee_enabled: EEE configured mode (enabled/disabled).
+ * @tx_lpi_enabled: Whether the interface should assert its tx lpi, given
+ *	that eee was negotiated.
+ * @tx_lpi_timer: Time in microseconds the interface delays prior to asserting
+ *	its tx lpi (after reaching 'idle' state). Effective only when eee
+ *	was negotiated and tx_lpi_enabled was set.
+ */
+struct ethtool_eee {
+	__u32	cmd;
+	__u32	supported;
+	__u32	advertised;
+	__u32	lp_advertised;
+	__u32	eee_active;
+	__u32	eee_enabled;
+	__u32	tx_lpi_enabled;
+	__u32	tx_lpi_timer;
+	__u32	reserved[2];
+};
+
+/**
  * struct ethtool_modinfo - plugin module eeprom information
  * @cmd: %ETHTOOL_GMODULEINFO
  * @type: Standard the module information conforms to %ETH_MODULE_SFF_xxxx
@@ -945,6 +974,8 @@ static inline u32 ethtool_rxfh_indir_default(u32 index, u32 n_rx_rings)
  * @get_module_info: Get the size and type of the eeprom contained within
  *	a plug-in module.
  * @get_module_eeprom: Get the eeprom information from the plug-in module
+ * @get_eee: Get Energy Efficient Ethernet (EEE) support and status.
+ * @set_eee: Set EEE status (enable/disable) as well as LPI timers.
  *
  * All operations are optional (i.e. the function pointer may be set
  * to %NULL) and callers must take this into account.  Callers must
@@ -1011,6 +1042,8 @@ struct ethtool_ops {
 				   struct ethtool_modinfo *);
 	int     (*get_module_eeprom)(struct net_device *,
 				     struct ethtool_eeprom *, u8 *);
+	int	(*get_eee)(struct net_device *, struct ethtool_eee *);
+	int	(*set_eee)(struct net_device *, struct ethtool_eee *);
 
 
 };
@@ -1089,6 +1122,8 @@ struct ethtool_ops {
 #define ETHTOOL_GET_TS_INFO	0x00000041 /* Get time stamping and PHC info */
 #define ETHTOOL_GMODULEINFO	0x00000042 /* Get plug-in module information */
 #define ETHTOOL_GMODULEEEPROM	0x00000043 /* Get plug-in module eeprom */
+#define ETHTOOL_GEEE		0x00000044 /* Get EEE settings */
+#define ETHTOOL_SEEE		0x00000045 /* Set EEE settings */
 
 /* compatibility with older code */
 #define SPARC_ETH_GSET		ETHTOOL_GSET
diff --git a/net/core/ethtool.c b/net/core/ethtool.c
index 9c2afb4..5a582da 100644
--- a/net/core/ethtool.c
+++ b/net/core/ethtool.c
@@ -729,6 +729,40 @@ static int ethtool_set_wol(struct net_device *dev, char __user *useraddr)
 	return dev->ethtool_ops->set_wol(dev, &wol);
 }
 
+static int ethtool_get_eee(struct net_device *dev, char __user *useraddr)
+{
+	struct ethtool_eee edata;
+	int rc;
+
+	if (!dev->ethtool_ops->get_eee)
+		return -EOPNOTSUPP;
+
+	memset(&edata, 0, sizeof(struct ethtool_eee));
+	edata.cmd = ETHTOOL_GEEE;
+	rc = dev->ethtool_ops->get_eee(dev, &edata);
+
+	if (rc)
+		return rc;
+
+	if (copy_to_user(useraddr, &edata, sizeof(edata)))
+		return -EFAULT;
+
+	return 0;
+}
+
+static int ethtool_set_eee(struct net_device *dev, char __user *useraddr)
+{
+	struct ethtool_eee edata;
+
+	if (!dev->ethtool_ops->set_eee)
+		return -EOPNOTSUPP;
+
+	if (copy_from_user(&edata, useraddr, sizeof(edata)))
+		return -EFAULT;
+
+	return dev->ethtool_ops->set_eee(dev, &edata);
+}
+
 static int ethtool_nway_reset(struct net_device *dev)
 {
 	if (!dev->ethtool_ops->nway_reset)
@@ -1471,6 +1505,12 @@ int dev_ethtool(struct net *net, struct ifreq *ifr)
 		rc = ethtool_set_value_void(dev, useraddr,
 				       dev->ethtool_ops->set_msglevel);
 		break;
+	case ETHTOOL_GEEE:
+		rc = ethtool_get_eee(dev, useraddr);
+		break;
+	case ETHTOOL_SEEE:
+		rc = ethtool_set_eee(dev, useraddr);
+		break;
 	case ETHTOOL_NWAY_RST:
 		rc = ethtool_nway_reset(dev);
 		break;
-- 
1.7.9.rc2

^ permalink raw reply related

* Re: [PATCH 1/2] e1000e: Disable ASPM L1 on 82574
From: Greg KH @ 2012-06-07  1:41 UTC (permalink / raw)
  To: Chris Boot
  Cc: e1000-devel, netdev, linux-kernel, nix, carolyn.wyborny, stable
In-Reply-To: <4FC93154.1060906@bootc.net>

On Fri, Jun 01, 2012 at 10:17:08PM +0100, Chris Boot wrote:
> On 23/04/2012 22:29, Chris Boot wrote:
> > ASPM on the 82574 causes trouble. Currently the driver disables L0s for
> > this NIC but only disables L1 if the MTU is >1500. This patch simply
> > causes L1 to be disabled regardless of the MTU setting.
> >
> > Signed-off-by: Chris Boot <bootc@bootc.net>
> > Cc: "Wyborny, Carolyn" <carolyn.wyborny@intel.com>
> > Cc: Nix <nix@esperi.org.uk>
> > Link: https://lkml.org/lkml/2012/3/19/362
> > ---
> >  drivers/net/ethernet/intel/e1000e/82571.c |    3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/net/ethernet/intel/e1000e/82571.c b/drivers/net/ethernet/intel/e1000e/82571.c
> > index b3fdc69..c6d95f2 100644
> > --- a/drivers/net/ethernet/intel/e1000e/82571.c
> > +++ b/drivers/net/ethernet/intel/e1000e/82571.c
> > @@ -2061,8 +2061,9 @@ const struct e1000_info e1000_82574_info = {
> >  				  | FLAG_HAS_SMART_POWER_DOWN
> >  				  | FLAG_HAS_AMT
> >  				  | FLAG_HAS_CTRLEXT_ON_LOAD,
> > -	.flags2			  = FLAG2_CHECK_PHY_HANG
> > +	.flags2			= FLAG2_CHECK_PHY_HANG
> >  				  | FLAG2_DISABLE_ASPM_L0S
> > +				  | FLAG2_DISABLE_ASPM_L1
> >  				  | FLAG2_NO_DISABLE_RX,
> >  	.pba			= 32,
> >  	.max_hw_frame_size	= DEFAULT_JUMBO,
> 
> Now that this patch is in master (d4a4206e) and has presumably been
> widely tested, what's the possibility of it making it into stable? I
> really should have included a CC to stable when I sent it...

I'd be glad to apply it, but it doesn't apply properly to the 3.4-stable
tree :(

> This patch should probably also be accompanied with 59aed952 (e1000e:
> Remove special case for 82573/82574 ASPM L1 disablement) on top, to
> remove a special case that's no longer required once this is applied.

As I can't apply the first one, this one shouldn't be applied either at
this point in time...

thanks,

greg k-h

^ permalink raw reply

* Re: [PATCH net] e1000e: Change wthresh to 1 to avoid possible Tx stalls.
From: David Miller @ 2012-06-07  1:34 UTC (permalink / raw)
  To: jeffrey.t.kirsher; +Cc: shimoda.hiroaki, eric.dumazet, denys, therbert, netdev
In-Reply-To: <1339030752.2075.1.camel@jtkirshe-mobl>

From: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Date: Wed, 06 Jun 2012 17:59:12 -0700

> This patch will cause unacceptable performance issues with non-ESB2
> parts.

You have to fix the regression a non-1 setting causes, performance
is secondary.

^ permalink raw reply

* [PATCH net-next] net: Update kernel-doc for __alloc_skb()
From: Ben Hutchings @ 2012-06-07  1:23 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, Grant Edwards
In-Reply-To: <jqoe5d$fs3$1@dough.gmane.org>

__alloc_skb() now extends tailroom to allow the use of padding added
by the heap allocator.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
---
 net/core/skbuff.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)
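
As a userspace analogy only (this is not the kernel code path), the
sketch below shows why "at least size bytes" is the accurate wording:
heap allocators commonly round a request up, so the usable area can be
larger than what was asked for. It assumes glibc's malloc_usable_size()
from <malloc.h>; usable_bytes_for() is a hypothetical helper name:

```c
#include <assert.h>
#include <malloc.h>
#include <stddef.h>
#include <stdlib.h>

/* Allocate `size` bytes and report how many bytes the allocator really
 * made usable. Returns 0 if the allocation failed. */
static size_t usable_bytes_for(size_t size)
{
	void *buf;
	size_t usable;

	assert(size > 0);

	buf = malloc(size);
	if (!buf)
		return 0;

	usable = malloc_usable_size(buf);
	free(buf);
	return usable;
}
```

The value returned is never smaller than the request and is often
larger, which mirrors the tailroom guarantee documented in the hunk
below.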

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 016694d..1d74cea 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -160,8 +160,8 @@ static void skb_under_panic(struct sk_buff *skb, int sz, void *here)
  *	@node: numa node to allocate memory on
  *
  *	Allocate a new &sk_buff. The returned buffer has no headroom and a
- *	tail room of size bytes. The object has a reference count of one.
- *	The return is the buffer. On a failure the return is %NULL.
+ *	tail room of at least size bytes. The object has a reference count
+ *	of one. The return is the buffer. On a failure the return is %NULL.
  *
  *	Buffers may only be allocated from interrupts using a @gfp_mask of
  *	%GFP_ATOMIC.
-- 
1.7.7.6


-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply related

* Re: [PATCH net] e1000e: Change wthresh to 1 to avoid possible Tx stalls.
From: Jeff Kirsher @ 2012-06-07  0:59 UTC (permalink / raw)
  To: Hiroaki SHIMODA, Eric Dumazet
  Cc: davem@davemloft.net, denys@visp.net.lb, eric.dumazet@gmail.com,
	therbert@google.com, netdev@vger.kernel.org
In-Reply-To: <20120606174355.823e9aa7.shimoda.hiroaki@gmail.com>

[-- Attachment #1: Type: text/plain, Size: 970 bytes --]

On Wed, 2012-06-06 at 01:43 -0700, Hiroaki SHIMODA wrote:
> Denys Fedoryshchenko reported Tx stalls on e1000e with BQL enabled.
> 
> e1000e has WTHRESH which determines when Tx descriptors are written
> back and successive Tx interrupts are generated, and setting WTHRESH
> to 5 gives efficient bus utilization but this can cause Tx stalls,
> especially on BQL enabled systems.
> 
> To avoid possible Tx stalls, change WTHRESH to 1.
> 
> Reported-by: Denys Fedoryshchenko <denys@visp.net.lb>
> Tested-by: Denys Fedoryshchenko <denys@visp.net.lb>
> Signed-off-by: Hiroaki SHIMODA <shimoda.hiroaki@gmail.com>
> ---
>  drivers/net/ethernet/intel/e1000e/e1000.h  |    6 +++---
>  drivers/net/ethernet/intel/e1000e/netdev.c |    2 +-
>  2 files changed, 4 insertions(+), 4 deletions(-) 

After further internal review, NACK.

This patch will cause unacceptable performance issues with non-ESB2
parts.

I am dropping this patch from my queue.


[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply

* Re: tcp wifi upload performance and lots of ACKs
From: Ben Greear @ 2012-06-07  0:40 UTC (permalink / raw)
  To: Daniel Baluta; +Cc: netdev
In-Reply-To: <4FCFF540.5010608@candelatech.com>

On 06/06/2012 05:26 PM, Ben Greear wrote:
> On 06/04/2012 12:22 PM, Daniel Baluta wrote:
>
>> Currently, there is no way to tune these parameters. Here is an experimental
>> patch [1]. If anyone thinks that this patch has a chance of getting accepted,
>> I will happily try to improve it further.
>>
>>>
>>> Packet traces and other info available if anyone wants to take a look.
>>
>> thanks,
>> Daniel.
>>
>> [1] http://marc.info/?l=linux-netdev&m=131983649130350&w=2
>
> I tried your patch in 3.5.0-rc1.
>
> Doesn't seem to help any...do you have suggested settings for
> the proc values? (I tried increasing tcp_delack_segs up to 10,
> but no significant increase...)
>
> I'll keep poking around.

Ahh, forcing send-buffer to be large helps significantly..wonder
if that was my original problem.
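
One way to force a send buffer size from an application is SO_SNDBUF,
sketched below. This is an assumption about the method; the message does
not say how the buffer was forced, and sndbuf_after_request() is a
hypothetical helper name:

```c
#include <assert.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/* Request a send buffer of `bytes` on a fresh TCP socket and return the
 * size the kernel reports afterwards, or -1 on error. */
static int sndbuf_after_request(int bytes)
{
	int fd, got = -1;
	socklen_t len = sizeof(got);

	assert(bytes > 0);

	fd = socket(AF_INET, SOCK_STREAM, 0);
	if (fd < 0)
		return -1;

	if (setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &bytes, sizeof(bytes)) < 0 ||
	    getsockopt(fd, SOL_SOCKET, SO_SNDBUF, &got, &len) < 0)
		got = -1;

	close(fd);
	return got;
}
```

On Linux the reported value is normally double the request, capped by
net.core.wmem_max, so the result should be compared against the request
rather than assumed equal to it.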

Will get some more useful results tomorrow...

Thanks,
Ben

>
> Thanks,
> Ben
>


-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

^ permalink raw reply

* Re: tcp wifi upload performance and lots of ACKs
From: Ben Greear @ 2012-06-07  0:26 UTC (permalink / raw)
  To: Daniel Baluta; +Cc: netdev
In-Reply-To: <CAEnQRZCNUYmP88Ocm_nG7gpA1Qcwy1tOc6kgCgZ7RqXcxQsHhg@mail.gmail.com>

On 06/04/2012 12:22 PM, Daniel Baluta wrote:

> Currently, there is no way to tune these parameters. Here is an experimental
> patch [1]. If anyone thinks that this patch has a chance of getting accepted,
> I will happily try to improve it further.
>
>>
>> Packet traces and other info available if anyone wants to take a look.
>
> thanks,
> Daniel.
>
> [1] http://marc.info/?l=linux-netdev&m=131983649130350&w=2

I tried your patch in 3.5.0-rc1.

Doesn't seem to help any...do you have suggested settings for
the proc values?  (I tried increasing tcp_delack_segs up to 10,
but no significant increase...)

I'll keep poking around.

Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

^ permalink raw reply

* Re: [PATCH 5/7] drivers/net/wireless/brcm80211/brcmfmac/bcmsdh.c: adjust suspicious bit operation
From: Franky Lin @ 2012-06-06 23:43 UTC (permalink / raw)
  To: Julia Lawall
  Cc: Brett Rudley, kernel-janitors-u79uwXL29TY76Z2rM5mHXA,
	Roland Vossen, Arend van Spriel, Kan Yan, John W. Linville,
	linux-wireless-u79uwXL29TY76Z2rM5mHXA,
	netdev-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, joe-6d6DIl74uiNBDgjK7y7TUQ,
	Julia Lawall
In-Reply-To: <1339018901-28439-6-git-send-email-Julia.Lawall-L2FTfq7BK8M@public.gmane.org>

On 06/06/2012 02:41 PM, Julia Lawall wrote:
> From: Julia Lawall <Julia.Lawall-L2FTfq7BK8M@public.gmane.org>
>
> IRQF_TRIGGER_HIGH is 0x00000004, so it seems that & was intended rather than |.
>
> This problem was found using Coccinelle (http://coccinelle.lip6.fr/).
>
> Signed-off-by: Julia Lawall <julia-dAYI7NvHqcQ@public.gmane.org>

Thanks, Julia. But this has already been fixed by Joe Perches [1] and
the patch has landed in the Linux wireless tree.

Franky

[1] https://lkml.org/lkml/2012/5/30/482
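
For readers without the original hunk at hand, a minimal sketch of why
the | form is suspicious. The constant is redefined locally from the
value quoted above so the snippet builds outside the kernel tree, and
both helper names are hypothetical rather than taken from bcmsdh.c:

```c
#include <stdbool.h>

/* Value quoted in the patch description; redefined here so the sketch
 * compiles outside the kernel tree. */
#define IRQF_TRIGGER_HIGH 0x00000004UL

/* Suspicious form: OR-ing in a non-zero constant never yields zero, so
 * the test is true for every value of flags. */
static bool trigger_high_with_or(unsigned long flags)
{
	return (flags | IRQF_TRIGGER_HIGH) != 0;
}

/* Intended form: AND checks whether the bit is actually set. */
static bool trigger_high_with_and(unsigned long flags)
{
	return (flags & IRQF_TRIGGER_HIGH) != 0;
}
```

With flags equal to 0 the first helper still reports true while the
second reports false, which is the kind of mismatch the Coccinelle rule
flags.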


^ permalink raw reply

