Netdev List
* Re: [PATCH BUG-FIX] ipv6: allow to send packet after receiving ICMPv6 Too Big message with MTU field less than  IPV6_MIN_MTU
From: Shan Wei @ 2010-04-19  6:49 UTC (permalink / raw)
  To: Herbert Xu
  Cc: David Miller, yoshfuji@linux-ipv6.org >> YOSHIFUJI Hideaki,
	魏勇军, vladislav.yasevich, kuznet, pekkas,
	jmorris, Patrick McHardy, eric.dumazet, sri,
	netdev@vger.kernel.org, linux-sctp
In-Reply-To: <20100419035535.GA7011@gondor.apana.org.au>

Herbert Xu wrote, at 04/19/2010 11:55 AM:
> 
> The patch looks good to me.

Thanks for reviewing this patch.

> If we wanted to optimise the allfrags case it may be better
> to reserve the space beforehand and generate the fragment header
> at the same time as we're doing the IPv6 header.
> 
> But it can't be all that important as it's been broken for so
> many years.

If somebody needs a patch to fix the broken allfrags case,
I would be pleased to write one.

-- 
Best Regards
-----
Shan Wei


> 
> Thanks,




* ARP updates and GARP
From: Sasha Levin @ 2010-04-19  6:54 UTC (permalink / raw)
  To: netdev@vger.kernel.org

Hi,

We are currently testing IP fail-over on storage devices, and have observed an issue with the IP transfer from one device to another.

Assuming we have 2 storage devices A and B, and a server C which uses the storage, the scenario is:

1. Device A sends an ARP request which server C sees; server C updates its ARP table with the MAC of device A.
2. Device A fails, device B takes over the IP and sends out a GARP.
3. Even though server C sees the GARP, it ignores it and keeps trying to communicate with device A until the entry is removed from its cache and a new ARP request is generated.

The code which causes this is located in arp_process() in net/ipv4/arp.c:

override = time_after(jiffies, n->updated + n->parms->locktime);

/* Broadcast replies and request packets
   do not assert neighbour reachability.
 */
if (arp->ar_op != htons(ARPOP_REPLY) ||
    skb->pkt_type != PACKET_HOST)
        state = NUD_STALE;
neigh_update(n, sha, state, override ? NEIGH_UPDATE_F_OVERRIDE : 0);
neigh_release(n);

According to the code, this happens because the kernel ignores any ARP update that arrives within a short period (the neighbour's locktime) after the previous ARP update. The reason stated in the comments is “If several different ARP replies follows back-to-back, use the FIRST one. It is possible, if several proxy agents are active. Taking the first reply prevents arp trashing and chooses the fastest router.”

This, however, doesn’t take into account GARPs, which are not sent by ARP proxies anyway. They are ignored too, causing a loss of communication for over a minute until the ARP cache refreshes.

Is there another reason for this rule? If not, would it be possible to submit a patch which takes GARPs into account when ignoring ARP updates?
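
To make the question concrete, here is a minimal userspace sketch of the kind of check such a patch might add. It assumes a gratuitous ARP can be recognised by its sender and target IP addresses being equal. The names is_gratuitous_arp() and arp_should_override() are hypothetical, chosen for illustration only; they are not existing kernel functions, and this is not a proposed patch.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not an existing kernel function: treat an ARP
 * packet as gratuitous when sender and target protocol addresses match. */
static bool is_gratuitous_arp(uint32_t sip, uint32_t tip)
{
	return sip == tip;
}

/* Hypothetical decision: mirror the locktime test from arp_process(),
 * but let a gratuitous ARP override the cached entry immediately. */
static bool arp_should_override(unsigned long now, unsigned long updated,
				unsigned long locktime,
				uint32_t sip, uint32_t tip)
{
	bool lock_expired;

	/* The wrap-safe comparison below only holds for short intervals. */
	assert(locktime <= (unsigned long)LONG_MAX);

	/* Same idea as time_after(now, updated + locktime). */
	lock_expired = (long)((updated + locktime) - now) < 0;

	return lock_expired || is_gratuitous_arp(sip, tip);
}
```

With this shape, a normal reply arriving inside the lock window is still ignored, so the "first reply wins" behaviour for proxy agents would be kept, while a GARP from the device taking over the IP would update the entry at once.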


Thanks!

Sasha.


* Re: [PATCH] ks8842: Add module param for setting mac address
From: David Miller @ 2010-04-19  7:13 UTC (permalink / raw)
  To: richard.rojfors; +Cc: netdev
In-Reply-To: <4BCBFC62.8010200@pelagicore.com>

From: Richard Röjfors <richard.rojfors@pelagicore.com>
Date: Mon, 19 Apr 2010 08:46:58 +0200

> On 04/19/2010 08:36 AM, David Miller wrote:
>> From: Richard Röjfors<richard.rojfors@pelagicore.com>
>> Date: Mon, 19 Apr 2010 08:16:29 +0200
>>
>>> I posted a new patch where the MAC address is passed via
>>> platform data, hope that's an accepted way.
>>
>> I saw also that you mentioned that there is no real
>> way to probe this information.
>>
>> Therefore I worry that your plan might be to simply
>> use some kernel command line or module option somewhere
>> else to fill in this platform_device value.
>>
>> Is that what you plan to do?
> 
> In the lab; yes.
> In the end, there will be a flash available to store
> system wide parameters.

Ok, your long term plan is fine I guess.


* Re: Phylib polling when doing mdio_read will cause system response and transfer speed drop
From: Wolfram Sang @ 2010-04-19  7:21 UTC (permalink / raw)
  To: Bryan Wu; +Cc: afleming, davem, netdev, LKML
In-Reply-To: <4BC50BF5.7080700@canonical.com>

[-- Attachment #1: Type: text/plain, Size: 1412 bytes --]

On Tue, Apr 13, 2010 at 05:27:33PM -0700, Bryan Wu wrote:

> I found the root cause is the polling operation in the mdio_read
> function. When we transfer large files, we experience many timeouts.
> So I have several questions here:

Same here, I saw the 'MDIO Timeout' message occasionally.

> 1. Do I need to return -ETIMEDOUT when polling times out? If I don't return
> -ETIMEDOUT, the performance improves a lot. After checking other drivers,
> some don't return anything, some return 0, and some return a negative value.
> What's the rule for this mdio_read polling timeout case?
>
> 2. How should polling busy-wait? Normally, we won't busy-wait very long
> in polling. But hardware is not perfect every time. Running cpu_relax()
> 10000 times in polling makes our system response very bad when the
> hardware doesn't set the flag as we expected. Maybe udelay(25) 10 times or
> msleep(1) 10 times is better than that.
>
> I have a patch that fixes this issue:
> http://kernel.ubuntu.com/git?p=roc/ubuntu-lucid.git;a=commitdiff;h=5d77e3409b319ca84183bf1d2fd158a9c864e03f.

Can't help with the details, but the patch seems to help here, too
(and not only because of the removed printk ;)).
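
For reference, the bounded polling discussed in question 2 could look roughly like the userspace sketch below. It is only an illustration under assumptions: the callbacks stand in for a status-register read and a short delay such as udelay(25), and mdio_wait_done(), struct fake_phy, fake_done() and fake_delay() are made-up names, not phylib interfaces.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical callbacks: "is the transfer finished?" and "wait a bit". */
typedef bool (*mdio_done_fn)(void *ctx);
typedef void (*mdio_delay_fn)(void *ctx);

/* Poll at most max_tries times, pausing between attempts, and report a
 * timeout explicitly instead of returning stale data. */
static int mdio_wait_done(mdio_done_fn done, mdio_delay_fn delay,
			  void *ctx, int max_tries)
{
	assert(max_tries > 0);

	for (int i = 0; i < max_tries; i++) {
		if (done(ctx))
			return 0;
		delay(ctx);
	}
	return -ETIMEDOUT;
}

/* Simulated device for trying the loop out: ready on the Nth poll. */
struct fake_phy {
	int ready_after;
	int polls;
	int delays;
};

static bool fake_done(void *ctx)
{
	struct fake_phy *phy = ctx;

	return ++phy->polls >= phy->ready_after;
}

static void fake_delay(void *ctx)
{
	struct fake_phy *phy = ctx;

	phy->delays++;	/* a driver would call udelay() or msleep() here */
}
```

The point of the small, fixed retry count is that a misbehaving PHY costs a bounded amount of time per read, and the caller can tell a timeout apart from valid register contents.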

Regards,

   Wolfram

-- 
Pengutronix e.K.                           | Wolfram Sang                |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 197 bytes --]


* Re: [PATCH RFC]: soreuseport: Bind multiple sockets to same port
From: Eric Dumazet @ 2010-04-19  7:28 UTC (permalink / raw)
  To: Tom Herbert; +Cc: davem, netdev
In-Reply-To: <alpine.DEB.1.00.1004182321480.1822@pokey.mtv.corp.google.com>

On Sunday 18 April 2010 at 23:33 -0700, Tom Herbert wrote:
> This is some work we've done to scale TCP listeners/UDP servers.  It
> might be apropos with some of the discussion on SO_REUSEADDR for UDP.
> ---
> This patch implements so_reuseport (SO_REUSEPORT socket option) for
> TCP and UDP.  For TCP, so_reuseport allows multiple listener sockets
> to be bound to the same port.  In the case of UDP, so_reuseport allows
> multiple sockets to bind to the same port.  To prevent port hijacking
> all sockets bound to the same port using so_reuseport must have the
> same uid.  Received packets are distributed to multiple sockets bound
> to the same port using a 4-tuple hash.
> 
> The motivating case for so_reuseport in TCP would be something like
> a web server binding to port 80 running with multiple threads, where
> each thread might have its own listener socket.  This could be done
> as an alternative to other models: 1) have one listener thread which
> dispatches completed connections to workers. 2) accept on a single
> listener socket from multiple threads.  In case #1 the listener thread
> can easily become the bottleneck with high connection turn-over rate.
> In case #2, the proportion of connections accepted per thread tends
> to be uneven under high connection load (assuming simple event loop:
> while (1) { accept(); process() }), since wakeup does not promote
> fairness among the sockets.  We have seen the disproportion to be as
> high as a 3:1 ratio between the thread accepting most connections and
> the one accepting the fewest.  With so_reuseport the distribution is
> uniform.
> 
> The TCP implementation has a problem in that the request sockets for a
> listener are attached to a listener socket.  If a SYN is received, a
> listener socket is chosen and request structure is created (SYN-RECV
> state).  If the subsequent ack in 3WHS does not match the same port
> by so_reuseport, the connection state is not found (reset) and the
> request structure is orphaned.  This scenario would occur when the
> number of listener sockets bound to a port changes (new ones are
> added, or old ones closed).  We are looking for a solution to this,
> maybe allow multiple sockets to share the same request table...
> 
> The motivating case for so_reuseport in UDP would be something like a
> DNS server.  An alternative would be to recv on the same socket from
> multiple threads.  As in the case of TCP, the load across these threads
> tends to be disproportionate and we also see a lot of contention on
> the socket lock.  Note that SO_REUSEADDR already allows multiple UDP
> sockets to bind to the same port, however there is no provision to
> prevent hijacking and nothing to distribute packets across all the
> sockets sharing the same bound port.  This patch does not change the
> semantics of SO_REUSEADDR, but provides usable functionality of it
> for unicast.


Hmm...

I am wondering how this thing is scalable...

In fact it is not.

We live in a world where 16-CPU machines are not uncommon right now.

A high perf DNS server on such a machine would have 16 threads, and probably
64 threads in two years.

I understand you want 16 UDP sockets to avoid lock contention, but
__udp4_lib_lookup() becomes a nightmare (It may already be ...)

My idea was to add a cpu lookup key.

thread0 would use a new setsockopt() option to bind a socket to a
virtual cpu0. Then do its normal bind( port=53)

...

threadN would use a new setsockopt() option to bind a socket to a
virtual cpuN. Then do its normal bind( port=53)

Each thread then does its normal worker loop.

Then, when receiving a frame on cpuN, we would automatically select the
right socket because its score is higher than others.
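
A rough userspace sketch of that scoring idea, to show what I mean. Everything here is hypothetical: struct demo_sock, demo_score() and demo_lookup() are illustration names, not the kernel's struct sock or the real lookup code, and the "cpu key" is just an int field.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-socket state: bound port plus an optional CPU key
 * (-1 would mean "no CPU preference"). */
struct demo_sock {
	unsigned short port;
	int cpu_key;
};

/* Score a candidate: -1 if the port does not match, 1 for a plain
 * match, 2 when the socket is also keyed to the receiving CPU. */
static int demo_score(const struct demo_sock *sk, unsigned short port,
		      int rx_cpu)
{
	if (sk->port != port)
		return -1;
	return sk->cpu_key == rx_cpu ? 2 : 1;
}

/* Return the index of the best-scoring socket, or -1 if none matches. */
static int demo_lookup(const struct demo_sock *socks, size_t n,
		       unsigned short port, int rx_cpu)
{
	int best = -1, best_score = -1;

	assert(socks != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		int score = demo_score(&socks[i], port, rx_cpu);

		if (score > best_score) {
			best_score = score;
			best = (int)i;
		}
	}
	return best;
}
```

Note the sketch still walks every candidate, so it only illustrates the selection rule; it does nothing about the cost of the lookup itself.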


Another possibility would be to extend the socket structure to be able to
have dynamically sized queues/locks.





* [GIT]: Networking
From: David Miller @ 2010-04-19  7:38 UTC (permalink / raw)
  To: torvalds; +Cc: akpm, netdev, linux-kernel


1) Fix TX lockups in forcedeth due to an incorrect chipset ID check
   wrt. whether to enable a TX hw bug workaround or not.  Fix from
   Ayaz Abdulla.

2) Fix some virtualization problems by orphan'ing the SKB on TX
   in tun driver.  From Michael S. Tsirkin.

3) Some minor fallout from the slab.h cleanups in gigaset driver,
   from Tilman Schmidt.

4) hdlc_ppp can crash on rmmod due to lack of PPP tx queue flush,
   fix from Krzysztof Halasa.

5) Three fixes from Eric Dumazet:
   a) ip_dev_loopback_xmit() needs to use netif_rx_ni() since it sometimes
      is invoked from user context and therefore an explicit check and
      run of softirqs is necessary.
   b) dev_pick_tx() should not cache a socket TX queue selection unless
      the socket's cached dst matches the one currently hung off of the
      skb
   c) Fix lockdep false positives in fib_trie

6) AF_PACKET erroneously restricts to init_net in some ioctls, fix
   from Daniel Lezcano.

7) iwlwifi active chain detection fix from Johannes Berg.

Please pull, thanks a lot!

The following changes since commit 13bd8e4673d527a9e48f41956b11d391e7c2cfe0:
  Linus Torvalds (1):
        Merge branch 'for-linus' of git://git.kernel.org/.../anholt/drm-intel

are available in the git repository at:

  master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6.git master

Ayaz Abdulla (1):
      forcedeth: fix tx limit2 flag check

Daniel Lezcano (1):
      packet : remove init_net restriction

David S. Miller (1):
      Merge branch 'master' of git://git.kernel.org/.../linville/wireless-2.6

Eric Dumazet (3):
      fib: suppress lockdep-RCU false positive in FIB trie.
      net: dev_pick_tx() fix
      ip: Fix ip_dev_loopback_xmit()

Johannes Berg (1):
      iwlwifi: work around bogus active chains detection

Krzysztof Halasa (1):
      WAN: flush tx_queue in hdlc_ppp to prevent panic on rmmod hw_driver.

Michael S. Tsirkin (1):
      tun: orphan an skb on tx

Tilman Schmidt (1):
      gigaset: include cleanup cleanup

 drivers/isdn/gigaset/bas-gigaset.c       |    5 -----
 drivers/isdn/gigaset/capi.c              |    2 --
 drivers/isdn/gigaset/common.c            |    2 --
 drivers/isdn/gigaset/gigaset.h           |    2 +-
 drivers/isdn/gigaset/i4l.c               |    1 -
 drivers/isdn/gigaset/interface.c         |    1 -
 drivers/isdn/gigaset/proc.c              |    1 -
 drivers/isdn/gigaset/ser-gigaset.c       |    3 ---
 drivers/isdn/gigaset/usb-gigaset.c       |    4 ----
 drivers/net/forcedeth.c                  |    2 +-
 drivers/net/tun.c                        |    4 ++++
 drivers/net/wan/hdlc_ppp.c               |    6 ++++++
 drivers/net/wireless/iwlwifi/iwl-calib.c |   12 ++++++++++++
 net/core/dev.c                           |    8 ++++++--
 net/ipv4/fib_trie.c                      |    4 +++-
 net/ipv4/ip_output.c                     |    2 +-
 net/ipv6/ip6_output.c                    |    2 +-
 net/packet/af_packet.c                   |    2 --
 18 files changed, 35 insertions(+), 28 deletions(-)


* [RFC] rps: shortcut net_rps_action()
From: Eric Dumazet @ 2010-04-19  9:37 UTC (permalink / raw)
  To: Tom Herbert, David Miller; +Cc: netdev
In-Reply-To: <1271590476.16881.4925.camel@edumazet-laptop>

net_rps_action() is a bit expensive on NR_CPUS=64..4096 kernels, even if
RPS is not active.

I add a flag to scan cpumask only if at least one IPI was scheduled.
Even cpumask_weight() might be expensive on some setups, where
nr_cpumask_bits could be very big (4096 for example)

Move all RPS logic into net_rps_action() to cleanup net_rx_action() code
(remove two ifdefs)

Move rps_remote_softirq_cpus into softnet_data to share its first cache
line, filling an existing hole.

In a future patch, we could call net_rps_action() from process_backlog()
to make sure we send IPI before handling this cpu backlog.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
 include/linux/netdevice.h |    5 +-
 net/core/dev.c            |   73 ++++++++++++++++--------------------
 2 files changed, 38 insertions(+), 40 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 649a025..283d3ef 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1389,8 +1389,11 @@ struct softnet_data {
 	struct list_head	poll_list;
 	struct sk_buff		*completion_queue;
 
-	/* Elements below can be accessed between CPUs for RPS */
 #ifdef CONFIG_RPS
+	unsigned int		rps_ipis_scheduled;
+	unsigned int		rps_select;
+	cpumask_t		rps_mask[2];
+	/* Elements below can be accessed between CPUs for RPS */
 	struct call_single_data	csd ____cacheline_aligned_in_smp;
 	unsigned int		input_queue_head;
 #endif
diff --git a/net/core/dev.c b/net/core/dev.c
index 7abf959..3e6e420 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2347,19 +2347,14 @@ done:
 }
 
 /*
- * This structure holds the per-CPU mask of CPUs for which IPIs are scheduled
+ * sofnet_data holds the per-CPU mask of CPUs for which IPIs are scheduled
  * to be sent to kick remote softirq processing.  There are two masks since
- * the sending of IPIs must be done with interrupts enabled.  The select field
+ * the sending of IPIs must be done with interrupts enabled.  The rps_select field
  * indicates the current mask that enqueue_backlog uses to schedule IPIs.
  * select is flipped before net_rps_action is called while still under lock,
  * net_rps_action then uses the non-selected mask to send the IPIs and clears
  * it without conflicting with enqueue_backlog operation.
  */
-struct rps_remote_softirq_cpus {
-	cpumask_t mask[2];
-	int select;
-};
-static DEFINE_PER_CPU(struct rps_remote_softirq_cpus, rps_remote_softirq_cpus);
 
 /* Called from hardirq (IPI) context */
 static void trigger_softirq(void *data)
@@ -2403,10 +2398,10 @@ enqueue:
 		if (napi_schedule_prep(&queue->backlog)) {
 #ifdef CONFIG_RPS
 			if (cpu != smp_processor_id()) {
-				struct rps_remote_softirq_cpus *rcpus =
-				    &__get_cpu_var(rps_remote_softirq_cpus);
+				struct softnet_data *myqueue = &__get_cpu_var(softnet_data);
 
-				cpu_set(cpu, rcpus->mask[rcpus->select]);
+				cpu_set(cpu, myqueue->rps_mask[myqueue->rps_select]);
+				myqueue->rps_ipis_scheduled = 1;
 				__raise_softirq_irqoff(NET_RX_SOFTIRQ);
 				goto enqueue;
 			}
@@ -2911,7 +2906,9 @@ int netif_receive_skb(struct sk_buff *skb)
 }
 EXPORT_SYMBOL(netif_receive_skb);
 
-/* Network device is going away, flush any packets still pending  */
+/* Network device is going away, flush any packets still pending
+ * Called with irqs disabled.
+ */
 static void flush_backlog(void *arg)
 {
 	struct net_device *dev = arg;
@@ -3340,24 +3337,36 @@ void netif_napi_del(struct napi_struct *napi)
 }
 EXPORT_SYMBOL(netif_napi_del);
 
-#ifdef CONFIG_RPS
 /*
- * net_rps_action sends any pending IPI's for rps.  This is only called from
- * softirq and interrupts must be enabled.
+ * net_rps_action sends any pending IPI's for rps.
+ * Note: called with local irq disabled, but exits with local irq enabled.
  */
-static void net_rps_action(cpumask_t *mask)
+static void net_rps_action(void)
 {
-	int cpu;
+#ifdef CONFIG_RPS
+	if (percpu_read(softnet_data.rps_ipis_scheduled)) {
+		struct softnet_data *queue = &__get_cpu_var(softnet_data);
+		int cpu, select = queue->rps_select;
+		cpumask_t *mask;
+		
+		queue->rps_ipis_scheduled = 0;
+		queue->rps_select ^= 1;
 
-	/* Send pending IPI's to kick RPS processing on remote cpus. */
-	for_each_cpu_mask_nr(cpu, *mask) {
-		struct softnet_data *queue = &per_cpu(softnet_data, cpu);
-		if (cpu_online(cpu))
-			__smp_call_function_single(cpu, &queue->csd, 0);
-	}
-	cpus_clear(*mask);
-}
+		local_irq_enable();
+
+		mask = &queue->rps_mask[select];
+
+		/* Send pending IPI's to kick RPS processing on remote cpus. */
+		for_each_cpu_mask_nr(cpu, *mask) {
+			struct softnet_data *remqueue = &per_cpu(softnet_data, cpu);
+			if (cpu_online(cpu))
+				__smp_call_function_single(cpu, &remqueue->csd, 0);
+		}
+		cpus_clear(*mask);
+	} else
 #endif
+		local_irq_enable();
+}
 
 static void net_rx_action(struct softirq_action *h)
 {
@@ -3365,10 +3374,6 @@ static void net_rx_action(struct softirq_action *h)
 	unsigned long time_limit = jiffies + 2;
 	int budget = netdev_budget;
 	void *have;
-#ifdef CONFIG_RPS
-	int select;
-	struct rps_remote_softirq_cpus *rcpus;
-#endif
 
 	local_irq_disable();
 
@@ -3431,17 +3436,7 @@ static void net_rx_action(struct softirq_action *h)
 		netpoll_poll_unlock(have);
 	}
 out:
-#ifdef CONFIG_RPS
-	rcpus = &__get_cpu_var(rps_remote_softirq_cpus);
-	select = rcpus->select;
-	rcpus->select ^= 1;
-
-	local_irq_enable();
-
-	net_rps_action(&rcpus->mask[select]);
-#else
-	local_irq_enable();
-#endif
+	net_rps_action();
 
 #ifdef CONFIG_NET_DMA
 	/*




* Re: [RFC] rps: shortcut net_rps_action()
From: Changli Gao @ 2010-04-19  9:48 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Tom Herbert, David Miller, netdev
In-Reply-To: <1271669822.16881.7520.camel@edumazet-laptop>

On Mon, Apr 19, 2010 at 5:37 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> net_rps_action() is a bit expensive on NR_CPUS=64..4096 kernels, even if
> RPS is not active.
>
> I add a flag to scan cpumask only if at least one IPI was scheduled.
> Even cpumask_weight() might be expensive on some setups, where
> nr_cpumask_bits could be very big (4096 for example)

How about using an array to save the CPU IDs? The number of CPUs to
which the IPI will be sent should be small.
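
Something like the following userspace sketch, just to illustrate the idea. The names (struct demo_ipi_list, demo_ipi_add(), demo_ipi_flush(), demo_send()) and the bound of 8 entries are assumptions made up for this example; this is not code from the patch.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_MAX_PENDING 8	/* assumed small bound, for illustration */

/* Hypothetical per-CPU state: a short list of target CPU IDs. */
struct demo_ipi_list {
	int cpus[DEMO_MAX_PENDING];
	int count;
};

/* Record a target CPU once. Returns false when the list is full, in
 * which case a caller would need a fallback such as the full cpumask. */
static bool demo_ipi_add(struct demo_ipi_list *l, int cpu)
{
	assert(l->count >= 0 && l->count <= DEMO_MAX_PENDING);

	for (int i = 0; i < l->count; i++)
		if (l->cpus[i] == cpu)
			return true;	/* already scheduled */
	if (l->count == DEMO_MAX_PENDING)
		return false;
	l->cpus[l->count++] = cpu;
	return true;
}

/* Visit each pending CPU and reset the list. The cost depends on the
 * number of entries, not on nr_cpumask_bits. Returns the number sent. */
static int demo_ipi_flush(struct demo_ipi_list *l, void (*send)(int cpu))
{
	int sent = l->count;

	for (int i = 0; i < l->count; i++)
		send(l->cpus[i]);
	l->count = 0;
	return sent;
}

/* Stand-in for sending an IPI: just counts calls. */
static int demo_sent_total;

static void demo_send(int cpu)
{
	(void)cpu;
	demo_sent_total++;
}
```

An empty list then costs one comparison on the fast path, at the price of a linear duplicate check on insert and a fallback when more than a handful of CPUs are targeted.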

-- 
Regards,
Changli Gao(xiaosuo@gmail.com)


* [PATCH net-2.6] cleanup: remove two unnecessary exports in skbuff.c.
From: Rami Rosen @ 2010-04-19  9:50 UTC (permalink / raw)
  To: davem, netdev

[-- Attachment #1: Type: text/plain, Size: 240 bytes --]

Hi,
There is no need to export skb_under_panic() and skb_over_panic() in
skbuff.c, since these functions are used only in skbuff.c; this patch
removes these two exports.


Regards,
Rami Rosen


Signed-off-by: Rami Rosen <ramirose@gmail.com>

[-- Attachment #2: patch.txt --]
[-- Type: text/plain, Size: 666 bytes --]

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 93c4e06..799b89d 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -126,7 +126,6 @@ void skb_over_panic(struct sk_buff *skb, int sz, void *here)
 	       skb->dev ? skb->dev->name : "<NULL>");
 	BUG();
 }
-EXPORT_SYMBOL(skb_over_panic);
 
 /**
  *	skb_under_panic	- 	private function
@@ -146,7 +145,6 @@ void skb_under_panic(struct sk_buff *skb, int sz, void *here)
 	       skb->dev ? skb->dev->name : "<NULL>");
 	BUG();
 }
-EXPORT_SYMBOL(skb_under_panic);
 
 /* 	Allocate a new skbuff. We do this ourselves so we can fill in a few
  *	'private' fields and also do memory statistics to find all the


* RE: [RFC][PATCH v2 0/3] Provide a zero-copy method on KVM virtio-net.
From: Xin, Xiaohui @ 2010-04-19 10:05 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: netdev@vger.kernel.org, kvm@vger.kernel.org,
	linux-kernel@vger.kernel.org, mingo@elte.hu,
	jdike@linux.intel.com, davem@davemloft.net
In-Reply-To: <20100415100546.GA17035@redhat.com>

> Michael,
> >>> The idea is simple, just to pin the guest VM user space and then
> >>> let host NIC driver has the chance to directly DMA to it. 
> >>> The patches are based on vhost-net backend driver. We add a device
> >>> which provides proto_ops as sendmsg/recvmsg to vhost-net to
> >>> send/recv directly to/from the NIC driver. KVM guest who use the
> >>> vhost-net backend may bind any ethX interface in the host side to
> >>> get copyless data transfer thru guest virtio-net frontend.
> >>> 
> >>> The scenario is like this:
> >>> 
> >>> The guest virtio-net driver submits multiple requests thru vhost-net
> >>> backend driver to the kernel. And the requests are queued and then
> >>> completed after corresponding actions in h/w are done.
> >>> 
> >>> For read, user space buffers are dispensed to NIC driver for rx when
> >>> a page constructor API is invoked. Means NICs can allocate user buffers
> >>> from a page constructor. We add a hook in netif_receive_skb() function
> >>> to intercept the incoming packets, and notify the zero-copy device.
> >>> 
> >>> For write, the zero-copy device may allocate a new host skb and put
> >>> payload on the skb_shinfo(skb)->frags, and copy the header to skb->data.
> >>> The request remains pending until the skb is transmitted by h/w.
> >>> 
> >>> Here, we have ever considered 2 ways to utilize the page constructor
> >>> API to dispense the user buffers.
> >>> 
> >>> One:	Modify __alloc_skb() function a bit, it can only allocate a 
> >>> 	structure of sk_buff, and the data pointer is pointing to a 
> >>> 	user buffer which is coming from a page constructor API.
> >>> 	Then the shinfo of the skb is also from guest.
> >>> 	When packet is received from hardware, the skb->data is filled
> >>> 	directly by h/w. What we have done is in this way.
> >>> 
> >>> 	Pros:	We can avoid any copy here.
> >>> 	Cons:	Guest virtio-net driver needs to allocate skb as almost
> >>> 		the same method with the host NIC drivers, say the size
> >>> 		of netdev_alloc_skb() and the same reserved space in the
> >>> 		head of skb. Many NIC drivers are the same with guest and
> >>> 		ok for this. But some latest NIC drivers reserve special
> >>> 		room in skb head. To deal with it, we suggest to provide
> >>> 		a method in guest virtio-net driver to ask for parameter
> >>> 		we interest from the NIC driver when we know which device 
> >>> 		we have bind to do zero-copy. Then we ask guest to do so.
> >>> 		Is that reasonable?
> >>Unfortunately, this would break compatibility with existing virtio.
> >>This also complicates migration.  
>> You mean any modification to the guest virtio-net driver will break the
>> compatibility? We tried to enlarge the virtio_net_config to contains the
>> 2 parameter, and add one VIRTIO_NET_F_PASSTHRU flag, virtionet_probe()
>> will check the feature flag, and get the parameters, then virtio-net driver use
>> it to allocate buffers. How about this?

>This means that we can't, for example, live-migrate between different systems
>without flushing outstanding buffers.

Ok. What we have thought about now is to do something with skb_reserve().
If the device is bound by mp, then skb_reserve() will do nothing with it.

> >>What is the room in skb head used for?
> >I'm not sure, but the latest ixgbe driver does this, it reserves 32 bytes compared to
>> NET_IP_ALIGN.

>Looking at code, this seems to do with alignment - could just be
>a performance optimization.

> >>> Two:	Modify driver to get user buffer allocated from a page constructor
> >>> 	API(to substitute alloc_page()), the user buffer are used as payload
> >>> 	buffers and filled by h/w directly when packet is received. Driver
> >>> 	should associate the pages with skb (skb_shinfo(skb)->frags). For 
> >>> 	the head buffer side, let host allocates skb, and h/w fills it. 
> >>> 	After that, the data filled in host skb header will be copied into
> >>> 	guest header buffer which is submitted together with the payload buffer.
> >>> 
> >>> 	Pros:	We could less care the way how guest or host allocates their
> >>> 		buffers.
> >>> 	Cons:	We still need a bit copy here for the skb header.
> >>> 
> >>> We are not sure which way is the better here. 
> >>The obvious question would be whether you see any speed difference
> >>with the two approaches. If no, then the second approach would be
> >>better.
> 
>> I remember the second approach is a bit slower in 1500MTU. 
>> But we did not tested too much.

>Well, that's an important datapoint. By the way, you'll need
>header copy to activate LRO in host, so that's a good
>reason to go with option 2 as well.


> >>> This is the first thing we want
> >>> to get comments from the community. We wish the modification to the network
> >>> part will be generic which not used by vhost-net backend only, but a user
> >>> application may use it as well when the zero-copy device may provides async
> >>> read/write operations later.
> >>> 
> >>> Please give comments especially for the network part modifications.
> >>> 
> >>> 
> >>> We provide multiple submits and asynchronous notification to
> >>> vhost-net too.
> >>> 
> >>> Our goal is to improve the bandwidth and reduce the CPU usage.
> >>> Exact performance data will be provided later. But for simple
> >>> test with netperf, we found bandwidth up and CPU % up too,
> >>> but the bandwidth up ratio is much more than CPU % up ratio.
> >>> 
> >>> What we have not done yet:
> >>> 	packet split support
> 
> >>What does this mean, exactly?
>> We can support 1500MTU, but for jumbo frame, since vhost driver before don't 
> >support mergeable buffer, we cannot try it for multiple sg.

>I do not see why, vhost currently supports 64K buffers with indirect
>descriptors.

The receive_skb() in the guest virtio-net driver will merge the multiple sg into skb frags; how can indirect descriptors do that?

>>> A jumbo frame will split 5
>>> frags and hook them once a descriptor, so the user buffer allocation is greatly dependent
>>> on how guest virtio-net drivers submits buffers. We think mergeable buffer is suitable for
>>> it.
> 
> >> 	To support GRO
>>> Actually, I think if the mergeable buffer may get good performance, then GRO is not 
>>> so important then.
> >>And TSO/GSO?
>>> Do we really need them?

>>My guess would be yes. Mergeable buffers is a memory saving
>>optimization, not a performance optimization, I don't see
>>that it can help. And I think you can't solely rely on jumbo frames
>>in hardware, not everyone can enable them.

>Having said that, number one priority is getting decent performance
>out of the driver, in whatever way you find fit. I was just
>suggesting obvious ways to do this.

Thanks.

> >> 	Performance tuning
> >> 
> >> what we have done in v1:
> >> 	polish the RCU usage
> >> 	deal with write logging in asynchronous mode in vhost
> >> 	add notifier block for mp device
> >> 	rename page_ctor to mp_port in netdevice.h to make it looks generic
> >> 	add mp_dev_change_flags() for mp device to change NIC state
> >> 	add CONIFG_VHOST_MPASSTHRU to limit the usage when module is not load
> >> 	a small fix for missing dev_put when fail
> >> 	using dynamic minor instead of static minor number
> >> 	a __KERNEL__ protect to mp_get_sock()
> >> 
> >> what we have done in v2:
> >> 	
> >> 	remove most of the RCU usage, since the ctor pointer is only
> >> 	changed by BIND/UNBIND ioctl, and during that time, NIC will be
> >> 	stopped to get good cleanup(all outstanding requests are finished),
> >> 	so the ctor pointer cannot be raced into wrong situation.
> >> 
> >> 	Remove the struct vhost_notifier with struct kiocb.
> >> 	Let vhost-net backend to alloc/free the kiocb and transfer them
> >> 	via sendmsg/recvmsg.
> >> 
> >> 	use get_user_pages_fast() and set_page_dirty_lock() when read.
> >> 
> >> 	Add some comments for netdev_mp_port_prep() and handle_mpassthru().
> >> 
> >> 
> >> Comments not addressed yet in this time:
> >> 	the async write logging is not satisfied by vhost-net
> >> 	Qemu needs a sync write
> >> 	a limit for locked pages from get_user_pages_fast()
> >> 	
> >> 		
> >> performance:
> >> 	using netperf with GSO/TSO disabled, 10G NIC, 
> >> 	disabled packet split mode, with raw socket case compared to vhost.
> >> 
> >> 	bandwidth will be from 1.1Gbps to 1.7Gbps
> >> 	CPU % from 120%-140% to 140%-160%


* Re: [RFC][PATCH v2 0/3] Provide a zero-copy method on KVM virtio-net.
From: Michael S. Tsirkin @ 2010-04-19 10:21 UTC (permalink / raw)
  To: Xin, Xiaohui
  Cc: netdev@vger.kernel.org, kvm@vger.kernel.org,
	linux-kernel@vger.kernel.org, mingo@elte.hu,
	jdike@linux.intel.com, davem@davemloft.net
In-Reply-To: <F2E9EB7348B8264F86B6AB8151CE2D79026FA95401@shsmsx502.ccr.corp.intel.com>

On Mon, Apr 19, 2010 at 06:05:17PM +0800, Xin, Xiaohui wrote:
> > Michael,
> > >>> The idea is simple, just to pin the guest VM user space and then
> > >>> let host NIC driver has the chance to directly DMA to it. 
> > >>> The patches are based on vhost-net backend driver. We add a device
> > >>> which provides proto_ops as sendmsg/recvmsg to vhost-net to
> > >>> send/recv directly to/from the NIC driver. KVM guest who use the
> > >>> vhost-net backend may bind any ethX interface in the host side to
> > >>> get copyless data transfer thru guest virtio-net frontend.
> > >>> 
> > >>> The scenario is like this:
> > >>> 
> > >>> The guest virtio-net driver submits multiple requests thru vhost-net
> > >>> backend driver to the kernel. And the requests are queued and then
> > >>> completed after corresponding actions in h/w are done.
> > >>> 
> > >>> For read, user space buffers are dispensed to NIC driver for rx when
> > >>> a page constructor API is invoked. Means NICs can allocate user buffers
> > >>> from a page constructor. We add a hook in netif_receive_skb() function
> > >>> to intercept the incoming packets, and notify the zero-copy device.
> > >>> 
> > >>> For write, the zero-copy device may allocate a new host skb and put
> > >>> payload on the skb_shinfo(skb)->frags, and copy the header to skb->data.
> > >>> The request remains pending until the skb is transmitted by h/w.
> > >>> 
> > >>> Here, we have ever considered 2 ways to utilize the page constructor
> > >>> API to dispense the user buffers.
> > >>> 
> > >>> One:	Modify __alloc_skb() function a bit, it can only allocate a 
> > >>> 	structure of sk_buff, and the data pointer is pointing to a 
> > >>> 	user buffer which is coming from a page constructor API.
> > >>> 	Then the shinfo of the skb is also from guest.
> > >>> 	When packet is received from hardware, the skb->data is filled
> > >>> 	directly by h/w. What we have done is in this way.
> > >>> 
> > >>> 	Pros:	We can avoid any copy here.
> > >>> 	Cons:	The guest virtio-net driver needs to allocate skbs in almost
> > >>> 		the same way as the host NIC drivers, e.g. the size
> > >>> 		used by netdev_alloc_skb() and the same reserved space at the
> > >>> 		head of the skb. Many NIC drivers match the guest here and
> > >>> 		are fine. But some of the latest NIC drivers reserve special
> > >>> 		room in the skb head. To deal with this, we suggest providing
> > >>> 		a method in the guest virtio-net driver to ask the NIC driver
> > >>> 		for the parameters we are interested in once we know which
> > >>> 		device we have bound for zero-copy. Then we ask the guest to
> > >>> 		allocate accordingly. Is that reasonable?
> > >>Unfortunately, this would break compatibility with existing virtio.
> > >>This also complicates migration.  
> >> You mean any modification to the guest virtio-net driver will break
> >> compatibility? We tried enlarging virtio_net_config to contain the
> >> 2 parameters and adding a VIRTIO_NET_F_PASSTHRU flag; virtionet_probe()
> >> checks the feature flag and gets the parameters, then the virtio-net driver
> >> uses them to allocate buffers. How about this?
> 
> >This means that we can't, for example, live-migrate between different systems
> >without flushing outstanding buffers.
> 
> Ok. What we are thinking about now is to do something with skb_reserve().
> If the device is bound by mp, then skb_reserve() will do nothing with it.
> 
> > >>What is the room in skb head used for?
> > >I'm not sure, but the latest ixgbe driver does this, it reserves 32 bytes compared to
> >> NET_IP_ALIGN.
> 
> >Looking at code, this seems to do with alignment - could just be
> >a performance optimization.
> 
> > >>> Two:	Modify the driver to get user buffers allocated from a page constructor
> > >>> 	API (substituting alloc_page()); the user buffers are used as payload
> > >>> 	buffers and filled by h/w directly when a packet is received. The driver
> > >>> 	should associate the pages with the skb (skb_shinfo(skb)->frags). For
> > >>> 	the head buffer side, let the host allocate the skb, and h/w fills it.
> > >>> 	After that, the data filled in the host skb header will be copied into
> > >>> 	the guest header buffer, which is submitted together with the payload buffer.
> > >>> 
> > >>> 	Pros:	We need to care less about how the guest or host allocates
> > >>> 		its buffers.
> > >>> 	Cons:	We still need a small copy here for the skb header.
> > >>> 
> > >>> We are not sure which way is better here.
> > >>The obvious question would be whether you see any speed difference
> > >>with the two approaches. If no, then the second approach would be
> > >>better.
> > 
> >> I remember the second approach is a bit slower with 1500MTU.
> >> But we did not test it much.
> 
> >Well, that's an important datapoint. By the way, you'll need
> >header copy to activate LRO in host, so that's a good
> >reason to go with option 2 as well.
> 
> 
> > >>> This is the first thing we want
> > >>> to get comments on from the community. We hope the modifications to the network
> > >>> part will be generic, usable not only by the vhost-net backend but also by a user
> > >>> application once the zero-copy device provides async
> > >>> read/write operations later.
> > >>> 
> > >>> Please give comments especially for the network part modifications.
> > >>> 
> > >>> 
> > >>> We provide multiple submits and asynchronous notification to
> > >>> vhost-net too.
> > >>> 
> > >>> Our goal is to improve the bandwidth and reduce the CPU usage.
> > >>> Exact performance data will be provided later. But in a simple
> > >>> test with netperf, we found bandwidth went up and CPU % went up too,
> > >>> but the bandwidth increase ratio is much larger than the CPU % increase ratio.
> > >>> 
> > >>> What we have not done yet:
> > >>> 	packet split support
> > 
> > >>What does this mean, exactly?
> >> We can support 1500MTU, but for jumbo frames, since the vhost driver previously didn't
> > >support mergeable buffers, we could not try them with multiple sg.
> 
> >I do not see why, vhost currently supports 64K buffers with indirect
> >descriptors.
> 
> The receive_skb() in the guest virtio-net driver will merge the multiple sg into skb frags; how can indirect descriptors do that?

See add_recvbuf_big.

> >>> A jumbo frame will be split into 5
> >>> frags hooked to a single descriptor, so the user buffer allocation depends greatly
> >>> on how the guest virtio-net driver submits buffers. We think mergeable buffers are suitable for it.
> > 
> > >> 	To support GRO
> >>> Actually, I think if mergeable buffers give good performance, then GRO is not
> >>> so important.
> > >>And TSO/GSO?
> >>> Do we really need them?
> 
> >>My guess would be yes. Mergeable buffers is a memory saving
> >>optimization, not a performance optimization, I don't see
> >>that it can help. And I think you can't solely rely on jumbo frames
> >>in hardware, not everyone can enable them.
> 
> >Having said that, number one priority is getting decent performance
> >out of the driver, in whatever way you find fit. I was just
> >suggesting obvious ways to do this.
> 
> Thanks.
> 
> > >> 	Performance tuning
> > >> 
> > >> what we have done in v1:
> > >> 	polish the RCU usage
> > >> 	deal with write logging in asynchronous mode in vhost
> > >> 	add notifier block for mp device
> > >> 	rename page_ctor to mp_port in netdevice.h to make it look generic
> > >> 	add mp_dev_change_flags() for mp device to change NIC state
> > >> 	add CONFIG_VHOST_MPASSTHRU to limit the usage when the module is not loaded
> > >> 	a small fix for missing dev_put when fail
> > >> 	using dynamic minor instead of static minor number
> > >> 	a __KERNEL__ protect to mp_get_sock()
> > >> 
> > >> what we have done in v2:
> > >> 	
> > >> 	remove most of the RCU usage, since the ctor pointer is only
> > >> 	changed by BIND/UNBIND ioctl, and during that time, NIC will be
> > >> 	stopped to get good cleanup(all outstanding requests are finished),
> > >> 	so the ctor pointer cannot be raced into wrong situation.
> > >> 
> > >> 	Remove the struct vhost_notifier with struct kiocb.
> > >> 	Let vhost-net backend to alloc/free the kiocb and transfer them
> > >> 	via sendmsg/recvmsg.
> > >> 
> > >> 	use get_user_pages_fast() and set_page_dirty_lock() when read.
> > >> 
> > >> 	Add some comments for netdev_mp_port_prep() and handle_mpassthru().
> > >> 
> > >> 
> > >> Comments not addressed yet in this time:
> > >> 	the async write logging is not satisfied by vhost-net
> > >> 	Qemu needs a sync write
> > >> 	a limit for locked pages from get_user_pages_fast()
> > >> 	
> > >> 		
> > >> performance:
> > >> 	using netperf with GSO/TSO disabled, 10G NIC, 
> > >> 	disabled packet split mode, with raw socket case compared to vhost.
> > >> 
> > >> 	bandwidth goes from 1.1Gbps to 1.7Gbps
> > >> 	CPU % from 120%-140% to 140%-160%

^ permalink raw reply

* [PATCH net-next-2.6 0/11] bnx2x: Bug fixes and enhancements
From: Vladislav Zolotarov @ 2010-04-19 11:13 UTC (permalink / raw)
  To: Dave Miller; +Cc: Eilon Greenstein, netdev list

Hi Dave,

Resubmitting this patch series with the following changes:
- patch 2 VPD: using generic infrastructures for VPD access
- omitting patch 11 (disabling LRO and leaving only GRO on XEN kernel at
compile time)
- omitting patch 10: changing the default queues to 1 on 32 bits system.

We are looking into using kmalloc() instead of vmalloc() to overcome the
issue on 32 bits kernels.

Please consider applying these changes to net-next.

Thanks,
vlad




^ permalink raw reply

* [PATCH net-next-2.6 1/11] bnx2x: Parity errors handling for 57710 and 57711
From: Vladislav Zolotarov @ 2010-04-19 11:13 UTC (permalink / raw)
  To: Dave Miller; +Cc: Eilon Greenstein, netdev list

This patch introduces the parity error handling code for 57710 and 57711 chips.

HW is configured to stop all DMA transactions to the host and to stop sending packets to the network
once a parity error is detected, which is meant to prevent silent data corruption.
At the same time HW generates an attention interrupt to every function of the device where the parity
error has been detected so that the driver can start the recovery flow.

The recovery consists of resetting the chip and restarting the driver on all active functions
of the chip where the parity error has been reported.
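
For illustration only (this is not part of the patch): the recovery
bookkeeping in the diff below keeps a 16-bit load counter and a
reset-in-progress flag in one shared register. A minimal user-space sketch
of that layout follows, assuming a plain variable in place of the chip
register; the model_* names are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Layout as in the patch: bits 0-15 count loaded functions, bit 16 is
 * set while a reset is in progress (the flag mask covers all upper bits).
 */
#define LOAD_COUNTER_BITS	16
#define LOAD_COUNTER_MASK	(((uint32_t)1 << LOAD_COUNTER_BITS) - 1)
#define RESET_DONE_FLAG_MASK	(~LOAD_COUNTER_MASK)
#define RESET_DONE_FLAG_SHIFT	LOAD_COUNTER_BITS

/* Hypothetical stand-in for the shared chip register (assumption). */
static uint32_t model_gen_reg;

static void model_set_reset_in_progress(void)
{
	model_gen_reg |= (uint32_t)1 << RESET_DONE_FLAG_SHIFT;
}

static void model_set_reset_done(void)
{
	model_gen_reg &= ~((uint32_t)1 << RESET_DONE_FLAG_SHIFT);
}

static bool model_reset_is_done(void)
{
	return (model_gen_reg & RESET_DONE_FLAG_MASK) == 0;
}

/* Counter updates must leave the flag bits untouched. */
static void model_inc_load_cnt(void)
{
	uint32_t cnt = ((model_gen_reg & LOAD_COUNTER_MASK) + 1) &
		       LOAD_COUNTER_MASK;

	model_gen_reg = (model_gen_reg & RESET_DONE_FLAG_MASK) | cnt;
}

static uint32_t model_dec_load_cnt(void)
{
	uint32_t cnt = ((model_gen_reg & LOAD_COUNTER_MASK) - 1) &
		       LOAD_COUNTER_MASK;

	model_gen_reg = (model_gen_reg & RESET_DONE_FLAG_MASK) | cnt;
	return cnt;
}

/* Two functions load, a parity error hits, both unload, reset completes. */
static int model_selftest(void)
{
	model_gen_reg = 0;
	model_inc_load_cnt();
	model_inc_load_cnt();
	model_set_reset_in_progress();
	assert(!model_reset_is_done());
	assert((model_gen_reg & LOAD_COUNTER_MASK) == 2);
	assert(model_dec_load_cnt() == 1);
	assert(model_dec_load_cnt() == 0);	/* last function out */
	model_set_reset_done();
	assert(model_reset_is_done());
	return 0;
}
```

Calling model_selftest() walks one load/error/unload cycle. The point of
the masking is that a function loading or unloading during recovery cannot
clobber the reset flag, and setting or clearing the flag cannot change the
counter.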

Signed-off-by: Vladislav Zolotarov <vladz@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
---
 drivers/net/bnx2x.h      |   23 +-
 drivers/net/bnx2x_main.c | 1045 +++++++++++++++++++++++++++++++++++++++++++---
 drivers/net/bnx2x_reg.h  |   27 ++-
 3 files changed, 1039 insertions(+), 56 deletions(-)

diff --git a/drivers/net/bnx2x.h b/drivers/net/bnx2x.h
index ae9c89e..37e9b16 100644
--- a/drivers/net/bnx2x.h
+++ b/drivers/net/bnx2x.h
@@ -155,9 +155,15 @@ do {								 \
 #define SHMEM2_RD(bp, field)		REG_RD(bp, SHMEM2_ADDR(bp, field))
 #define SHMEM2_WR(bp, field, val)	REG_WR(bp, SHMEM2_ADDR(bp, field), val)
 
+#define MF_CFG_RD(bp, field)		SHMEM_RD(bp, mf_cfg.field)
+#define MF_CFG_WR(bp, field, val)	SHMEM_WR(bp, mf_cfg.field, val)
+
 #define EMAC_RD(bp, reg)		REG_RD(bp, emac_base + reg)
 #define EMAC_WR(bp, reg, val)		REG_WR(bp, emac_base + reg, val)
 
+#define AEU_IN_ATTN_BITS_PXPPCICLOCKCLIENT_PARITY_ERROR \
+	AEU_INPUTS_ATTN_BITS_PXPPCICLOCKCLIENT_PARITY_ERROR
+
 
 /* fast path */
 
@@ -818,6 +824,12 @@ struct attn_route {
 	u32	sig[4];
 };
 
+typedef enum {
+	BNX2X_RECOVERY_DONE,
+	BNX2X_RECOVERY_INIT,
+	BNX2X_RECOVERY_WAIT,
+} bnx2x_recovery_state_t;
+
 struct bnx2x {
 	/* Fields used in the tx and intr/napi performance paths
 	 * are grouped together in the beginning of the structure
@@ -835,6 +847,9 @@ struct bnx2x {
 	struct pci_dev		*pdev;
 
 	atomic_t		intr_sem;
+
+	bnx2x_recovery_state_t	recovery_state;
+	int			is_leader;
 #ifdef BCM_CNIC
 	struct msix_entry	msix_table[MAX_CONTEXT+2];
 #else
@@ -924,8 +939,7 @@ struct bnx2x {
 	int			mrrs;
 
 	struct delayed_work	sp_task;
-	struct work_struct	reset_task;
-
+	struct delayed_work	reset_task;
 	struct timer_list	timer;
 	int			current_interval;
 
@@ -1125,6 +1139,7 @@ static inline u32 reg_poll(struct bnx2x *bp, u32 reg, u32 expected, int ms,
 #define LOAD_DIAG			2
 #define UNLOAD_NORMAL			0
 #define UNLOAD_CLOSE			1
+#define UNLOAD_RECOVERY                 2
 
 
 /* DMAE command defines */
@@ -1294,6 +1309,10 @@ static inline u32 reg_poll(struct bnx2x *bp, u32 reg, u32 expected, int ms,
 				 AEU_INPUTS_ATTN_BITS_IGU_PARITY_ERROR | \
 				 AEU_INPUTS_ATTN_BITS_MISC_PARITY_ERROR)
 
+#define HW_PRTY_ASSERT_SET_3 (AEU_INPUTS_ATTN_BITS_MCP_LATCHED_ROM_PARITY | \
+		AEU_INPUTS_ATTN_BITS_MCP_LATCHED_UMP_RX_PARITY | \
+		AEU_INPUTS_ATTN_BITS_MCP_LATCHED_UMP_TX_PARITY | \
+		AEU_INPUTS_ATTN_BITS_MCP_LATCHED_SCPAD_PARITY)
 
 #define MULTI_FLAGS(bp) \
 		(TSTORM_ETH_FUNCTION_COMMON_CONFIG_RSS_IPV4_CAPABILITY | \
diff --git a/drivers/net/bnx2x_main.c b/drivers/net/bnx2x_main.c
index 63a17d6..bbaf4ac 100644
--- a/drivers/net/bnx2x_main.c
+++ b/drivers/net/bnx2x_main.c
@@ -764,6 +764,40 @@ static void bnx2x_int_disable_sync(struct bnx2x *bp, int disable_hw)
  * General service functions
  */
 
+/* Return true if succeeded to acquire the lock */
+static bool bnx2x_trylock_hw_lock(struct bnx2x *bp, u32 resource)
+{
+	u32 lock_status;
+	u32 resource_bit = (1 << resource);
+	int func = BP_FUNC(bp);
+	u32 hw_lock_control_reg;
+
+	DP(NETIF_MSG_HW, "Trying to take a lock on resource %d\n", resource);
+
+	/* Validating that the resource is within range */
+	if (resource > HW_LOCK_MAX_RESOURCE_VALUE) {
+		DP(NETIF_MSG_HW,
+		   "resource(0x%x) > HW_LOCK_MAX_RESOURCE_VALUE(0x%x)\n",
+		   resource, HW_LOCK_MAX_RESOURCE_VALUE);
+		return false;
+	}
+
+	if (func <= 5)
+		hw_lock_control_reg = (MISC_REG_DRIVER_CONTROL_1 + func*8);
+	else
+		hw_lock_control_reg =
+				(MISC_REG_DRIVER_CONTROL_7 + (func - 6)*8);
+
+	/* Try to acquire the lock */
+	REG_WR(bp, hw_lock_control_reg + 4, resource_bit);
+	lock_status = REG_RD(bp, hw_lock_control_reg);
+	if (lock_status & resource_bit)
+		return true;
+
+	DP(NETIF_MSG_HW, "Failed to get a lock on resource %d\n", resource);
+	return false;
+}
+
 static inline void bnx2x_ack_sb(struct bnx2x *bp, u8 sb_id,
 				u8 storm, u16 index, u8 op, u8 update)
 {
@@ -1901,6 +1935,8 @@ static int bnx2x_release_hw_lock(struct bnx2x *bp, u32 resource)
 	int func = BP_FUNC(bp);
 	u32 hw_lock_control_reg;
 
+	DP(NETIF_MSG_HW, "Releasing a lock on resource %d\n", resource);
+
 	/* Validating that the resource is within range */
 	if (resource > HW_LOCK_MAX_RESOURCE_VALUE) {
 		DP(NETIF_MSG_HW,
@@ -2741,12 +2777,11 @@ static int bnx2x_sp_post(struct bnx2x *bp, int command, int cid,
 /* acquire split MCP access lock register */
 static int bnx2x_acquire_alr(struct bnx2x *bp)
 {
-	u32 i, j, val;
+	u32 j, val;
 	int rc = 0;
 
 	might_sleep();
-	i = 100;
-	for (j = 0; j < i*10; j++) {
+	for (j = 0; j < 1000; j++) {
 		val = (1UL << 31);
 		REG_WR(bp, GRCBASE_MCP + 0x9c, val);
 		val = REG_RD(bp, GRCBASE_MCP + 0x9c);
@@ -2766,9 +2801,7 @@ static int bnx2x_acquire_alr(struct bnx2x *bp)
 /* release split MCP access lock register */
 static void bnx2x_release_alr(struct bnx2x *bp)
 {
-	u32 val = 0;
-
-	REG_WR(bp, GRCBASE_MCP + 0x9c, val);
+	REG_WR(bp, GRCBASE_MCP + 0x9c, 0);
 }
 
 static inline u16 bnx2x_update_dsb_idx(struct bnx2x *bp)
@@ -2824,7 +2857,7 @@ static void bnx2x_attn_int_asserted(struct bnx2x *bp, u32 asserted)
 
 	DP(NETIF_MSG_HW, "aeu_mask %x  newly asserted %x\n",
 	   aeu_mask, asserted);
-	aeu_mask &= ~(asserted & 0xff);
+	aeu_mask &= ~(asserted & 0x3ff);
 	DP(NETIF_MSG_HW, "new mask %x\n", aeu_mask);
 
 	REG_WR(bp, aeu_addr, aeu_mask);
@@ -3105,10 +3138,311 @@ static inline void bnx2x_attn_int_deasserted3(struct bnx2x *bp, u32 attn)
 	}
 }
 
-static void bnx2x_attn_int_deasserted(struct bnx2x *bp, u32 deasserted)
+static int bnx2x_nic_unload(struct bnx2x *bp, int unload_mode);
+static int bnx2x_nic_load(struct bnx2x *bp, int load_mode);
+
+
+#define BNX2X_MISC_GEN_REG      MISC_REG_GENERIC_POR_1
+#define LOAD_COUNTER_BITS	16 /* Number of bits for load counter */
+#define LOAD_COUNTER_MASK	(((u32)0x1 << LOAD_COUNTER_BITS) - 1)
+#define RESET_DONE_FLAG_MASK	(~LOAD_COUNTER_MASK)
+#define RESET_DONE_FLAG_SHIFT	LOAD_COUNTER_BITS
+#define CHIP_PARITY_SUPPORTED(bp)   (CHIP_IS_E1(bp) || CHIP_IS_E1H(bp))
+/*
+ * should be run under rtnl lock
+ */
+static inline void bnx2x_set_reset_done(struct bnx2x *bp)
+{
+	u32 val	= REG_RD(bp, BNX2X_MISC_GEN_REG);
+	val &= ~(1 << RESET_DONE_FLAG_SHIFT);
+	REG_WR(bp, BNX2X_MISC_GEN_REG, val);
+	barrier();
+	mmiowb();
+}
+
+/*
+ * should be run under rtnl lock
+ */
+static inline void bnx2x_set_reset_in_progress(struct bnx2x *bp)
+{
+	u32 val	= REG_RD(bp, BNX2X_MISC_GEN_REG);
+	val |= (1 << RESET_DONE_FLAG_SHIFT);
+	REG_WR(bp, BNX2X_MISC_GEN_REG, val);
+	barrier();
+	mmiowb();
+}
+
+/*
+ * should be run under rtnl lock
+ */
+static inline bool bnx2x_reset_is_done(struct bnx2x *bp)
+{
+	u32 val	= REG_RD(bp, BNX2X_MISC_GEN_REG);
+	DP(NETIF_MSG_HW, "GEN_REG_VAL=0x%08x\n", val);
+	return (val & RESET_DONE_FLAG_MASK) ? false : true;
+}
+
+/*
+ * should be run under rtnl lock
+ */
+static inline void bnx2x_inc_load_cnt(struct bnx2x *bp)
+{
+	u32 val1, val = REG_RD(bp, BNX2X_MISC_GEN_REG);
+
+	DP(NETIF_MSG_HW, "Old GEN_REG_VAL=0x%08x\n", val);
+
+	val1 = ((val & LOAD_COUNTER_MASK) + 1) & LOAD_COUNTER_MASK;
+	REG_WR(bp, BNX2X_MISC_GEN_REG, (val & RESET_DONE_FLAG_MASK) | val1);
+	barrier();
+	mmiowb();
+}
+
+/*
+ * should be run under rtnl lock
+ */
+static inline u32 bnx2x_dec_load_cnt(struct bnx2x *bp)
+{
+	u32 val1, val = REG_RD(bp, BNX2X_MISC_GEN_REG);
+
+	DP(NETIF_MSG_HW, "Old GEN_REG_VAL=0x%08x\n", val);
+
+	val1 = ((val & LOAD_COUNTER_MASK) - 1) & LOAD_COUNTER_MASK;
+	REG_WR(bp, BNX2X_MISC_GEN_REG, (val & RESET_DONE_FLAG_MASK) | val1);
+	barrier();
+	mmiowb();
+
+	return val1;
+}
+
+/*
+ * should be run under rtnl lock
+ */
+static inline u32 bnx2x_get_load_cnt(struct bnx2x *bp)
+{
+	return REG_RD(bp, BNX2X_MISC_GEN_REG) & LOAD_COUNTER_MASK;
+}
+
+static inline void bnx2x_clear_load_cnt(struct bnx2x *bp)
+{
+	u32 val = REG_RD(bp, BNX2X_MISC_GEN_REG);
+	REG_WR(bp, BNX2X_MISC_GEN_REG, val & (~LOAD_COUNTER_MASK));
+}
+
+static inline void _print_next_block(int idx, const char *blk)
+{
+	if (idx)
+		pr_cont(", ");
+	pr_cont("%s", blk);
+}
+
+static inline int bnx2x_print_blocks_with_parity0(u32 sig, int par_num)
+{
+	int i = 0;
+	u32 cur_bit = 0;
+	for (i = 0; sig; i++) {
+		cur_bit = ((u32)0x1 << i);
+		if (sig & cur_bit) {
+			switch (cur_bit) {
+			case AEU_INPUTS_ATTN_BITS_BRB_PARITY_ERROR:
+				_print_next_block(par_num++, "BRB");
+				break;
+			case AEU_INPUTS_ATTN_BITS_PARSER_PARITY_ERROR:
+				_print_next_block(par_num++, "PARSER");
+				break;
+			case AEU_INPUTS_ATTN_BITS_TSDM_PARITY_ERROR:
+				_print_next_block(par_num++, "TSDM");
+				break;
+			case AEU_INPUTS_ATTN_BITS_SEARCHER_PARITY_ERROR:
+				_print_next_block(par_num++, "SEARCHER");
+				break;
+			case AEU_INPUTS_ATTN_BITS_TSEMI_PARITY_ERROR:
+				_print_next_block(par_num++, "TSEMI");
+				break;
+			}
+
+			/* Clear the bit */
+			sig &= ~cur_bit;
+		}
+	}
+
+	return par_num;
+}
+
+static inline int bnx2x_print_blocks_with_parity1(u32 sig, int par_num)
+{
+	int i = 0;
+	u32 cur_bit = 0;
+	for (i = 0; sig; i++) {
+		cur_bit = ((u32)0x1 << i);
+		if (sig & cur_bit) {
+			switch (cur_bit) {
+			case AEU_INPUTS_ATTN_BITS_PBCLIENT_PARITY_ERROR:
+				_print_next_block(par_num++, "PBCLIENT");
+				break;
+			case AEU_INPUTS_ATTN_BITS_QM_PARITY_ERROR:
+				_print_next_block(par_num++, "QM");
+				break;
+			case AEU_INPUTS_ATTN_BITS_XSDM_PARITY_ERROR:
+				_print_next_block(par_num++, "XSDM");
+				break;
+			case AEU_INPUTS_ATTN_BITS_XSEMI_PARITY_ERROR:
+				_print_next_block(par_num++, "XSEMI");
+				break;
+			case AEU_INPUTS_ATTN_BITS_DOORBELLQ_PARITY_ERROR:
+				_print_next_block(par_num++, "DOORBELLQ");
+				break;
+			case AEU_INPUTS_ATTN_BITS_VAUX_PCI_CORE_PARITY_ERROR:
+				_print_next_block(par_num++, "VAUX PCI CORE");
+				break;
+			case AEU_INPUTS_ATTN_BITS_DEBUG_PARITY_ERROR:
+				_print_next_block(par_num++, "DEBUG");
+				break;
+			case AEU_INPUTS_ATTN_BITS_USDM_PARITY_ERROR:
+				_print_next_block(par_num++, "USDM");
+				break;
+			case AEU_INPUTS_ATTN_BITS_USEMI_PARITY_ERROR:
+				_print_next_block(par_num++, "USEMI");
+				break;
+			case AEU_INPUTS_ATTN_BITS_UPB_PARITY_ERROR:
+				_print_next_block(par_num++, "UPB");
+				break;
+			case AEU_INPUTS_ATTN_BITS_CSDM_PARITY_ERROR:
+				_print_next_block(par_num++, "CSDM");
+				break;
+			}
+
+			/* Clear the bit */
+			sig &= ~cur_bit;
+		}
+	}
+
+	return par_num;
+}
+
+static inline int bnx2x_print_blocks_with_parity2(u32 sig, int par_num)
+{
+	int i = 0;
+	u32 cur_bit = 0;
+	for (i = 0; sig; i++) {
+		cur_bit = ((u32)0x1 << i);
+		if (sig & cur_bit) {
+			switch (cur_bit) {
+			case AEU_INPUTS_ATTN_BITS_CSEMI_PARITY_ERROR:
+				_print_next_block(par_num++, "CSEMI");
+				break;
+			case AEU_INPUTS_ATTN_BITS_PXP_PARITY_ERROR:
+				_print_next_block(par_num++, "PXP");
+				break;
+			case AEU_IN_ATTN_BITS_PXPPCICLOCKCLIENT_PARITY_ERROR:
+				_print_next_block(par_num++,
+					"PXPPCICLOCKCLIENT");
+				break;
+			case AEU_INPUTS_ATTN_BITS_CFC_PARITY_ERROR:
+				_print_next_block(par_num++, "CFC");
+				break;
+			case AEU_INPUTS_ATTN_BITS_CDU_PARITY_ERROR:
+				_print_next_block(par_num++, "CDU");
+				break;
+			case AEU_INPUTS_ATTN_BITS_IGU_PARITY_ERROR:
+				_print_next_block(par_num++, "IGU");
+				break;
+			case AEU_INPUTS_ATTN_BITS_MISC_PARITY_ERROR:
+				_print_next_block(par_num++, "MISC");
+				break;
+			}
+
+			/* Clear the bit */
+			sig &= ~cur_bit;
+		}
+	}
+
+	return par_num;
+}
+
+static inline int bnx2x_print_blocks_with_parity3(u32 sig, int par_num)
+{
+	int i = 0;
+	u32 cur_bit = 0;
+	for (i = 0; sig; i++) {
+		cur_bit = ((u32)0x1 << i);
+		if (sig & cur_bit) {
+			switch (cur_bit) {
+			case AEU_INPUTS_ATTN_BITS_MCP_LATCHED_ROM_PARITY:
+				_print_next_block(par_num++, "MCP ROM");
+				break;
+			case AEU_INPUTS_ATTN_BITS_MCP_LATCHED_UMP_RX_PARITY:
+				_print_next_block(par_num++, "MCP UMP RX");
+				break;
+			case AEU_INPUTS_ATTN_BITS_MCP_LATCHED_UMP_TX_PARITY:
+				_print_next_block(par_num++, "MCP UMP TX");
+				break;
+			case AEU_INPUTS_ATTN_BITS_MCP_LATCHED_SCPAD_PARITY:
+				_print_next_block(par_num++, "MCP SCPAD");
+				break;
+			}
+
+			/* Clear the bit */
+			sig &= ~cur_bit;
+		}
+	}
+
+	return par_num;
+}
+
+static inline bool bnx2x_parity_attn(struct bnx2x *bp, u32 sig0, u32 sig1,
+				     u32 sig2, u32 sig3)
+{
+	if ((sig0 & HW_PRTY_ASSERT_SET_0) || (sig1 & HW_PRTY_ASSERT_SET_1) ||
+	    (sig2 & HW_PRTY_ASSERT_SET_2) || (sig3 & HW_PRTY_ASSERT_SET_3)) {
+		int par_num = 0;
+		DP(NETIF_MSG_HW, "Was parity error: HW block parity attention: "
+			"[0]:0x%08x [1]:0x%08x "
+			"[2]:0x%08x [3]:0x%08x\n",
+			  sig0 & HW_PRTY_ASSERT_SET_0,
+			  sig1 & HW_PRTY_ASSERT_SET_1,
+			  sig2 & HW_PRTY_ASSERT_SET_2,
+			  sig3 & HW_PRTY_ASSERT_SET_3);
+		printk(KERN_ERR"%s: Parity errors detected in blocks: ",
+		       bp->dev->name);
+		par_num = bnx2x_print_blocks_with_parity0(
+			sig0 & HW_PRTY_ASSERT_SET_0, par_num);
+		par_num = bnx2x_print_blocks_with_parity1(
+			sig1 & HW_PRTY_ASSERT_SET_1, par_num);
+		par_num = bnx2x_print_blocks_with_parity2(
+			sig2 & HW_PRTY_ASSERT_SET_2, par_num);
+		par_num = bnx2x_print_blocks_with_parity3(
+			sig3 & HW_PRTY_ASSERT_SET_3, par_num);
+		pr_cont("\n");
+		return true;
+	} else
+		return false;
+}
+
+static bool bnx2x_chk_parity_attn(struct bnx2x *bp)
 {
 	struct attn_route attn;
-	struct attn_route group_mask;
+	int port = BP_PORT(bp);
+
+	attn.sig[0] = REG_RD(bp,
+		MISC_REG_AEU_AFTER_INVERT_1_FUNC_0 +
+			     port*4);
+	attn.sig[1] = REG_RD(bp,
+		MISC_REG_AEU_AFTER_INVERT_2_FUNC_0 +
+			     port*4);
+	attn.sig[2] = REG_RD(bp,
+		MISC_REG_AEU_AFTER_INVERT_3_FUNC_0 +
+			     port*4);
+	attn.sig[3] = REG_RD(bp,
+		MISC_REG_AEU_AFTER_INVERT_4_FUNC_0 +
+			     port*4);
+
+	return bnx2x_parity_attn(bp, attn.sig[0], attn.sig[1], attn.sig[2],
+					attn.sig[3]);
+}
+
+static void bnx2x_attn_int_deasserted(struct bnx2x *bp, u32 deasserted)
+{
+	struct attn_route attn, *group_mask;
 	int port = BP_PORT(bp);
 	int index;
 	u32 reg_addr;
@@ -3119,6 +3453,19 @@ static void bnx2x_attn_int_deasserted(struct bnx2x *bp, u32 deasserted)
 	   try to handle this event */
 	bnx2x_acquire_alr(bp);
 
+	if (bnx2x_chk_parity_attn(bp)) {
+		bp->recovery_state = BNX2X_RECOVERY_INIT;
+		bnx2x_set_reset_in_progress(bp);
+		schedule_delayed_work(&bp->reset_task, 0);
+		/* Disable HW interrupts */
+		bnx2x_int_disable(bp);
+		bnx2x_release_alr(bp);
+		/* In case of parity errors don't handle attentions so that
+		 * other function would "see" parity errors.
+		 */
+		return;
+	}
+
 	attn.sig[0] = REG_RD(bp, MISC_REG_AEU_AFTER_INVERT_1_FUNC_0 + port*4);
 	attn.sig[1] = REG_RD(bp, MISC_REG_AEU_AFTER_INVERT_2_FUNC_0 + port*4);
 	attn.sig[2] = REG_RD(bp, MISC_REG_AEU_AFTER_INVERT_3_FUNC_0 + port*4);
@@ -3128,28 +3475,20 @@ static void bnx2x_attn_int_deasserted(struct bnx2x *bp, u32 deasserted)
 
 	for (index = 0; index < MAX_DYNAMIC_ATTN_GRPS; index++) {
 		if (deasserted & (1 << index)) {
-			group_mask = bp->attn_group[index];
+			group_mask = &bp->attn_group[index];
 
 			DP(NETIF_MSG_HW, "group[%d]: %08x %08x %08x %08x\n",
-			   index, group_mask.sig[0], group_mask.sig[1],
-			   group_mask.sig[2], group_mask.sig[3]);
+			   index, group_mask->sig[0], group_mask->sig[1],
+			   group_mask->sig[2], group_mask->sig[3]);
 
 			bnx2x_attn_int_deasserted3(bp,
-					attn.sig[3] & group_mask.sig[3]);
+					attn.sig[3] & group_mask->sig[3]);
 			bnx2x_attn_int_deasserted1(bp,
-					attn.sig[1] & group_mask.sig[1]);
+					attn.sig[1] & group_mask->sig[1]);
 			bnx2x_attn_int_deasserted2(bp,
-					attn.sig[2] & group_mask.sig[2]);
+					attn.sig[2] & group_mask->sig[2]);
 			bnx2x_attn_int_deasserted0(bp,
-					attn.sig[0] & group_mask.sig[0]);
-
-			if ((attn.sig[0] & group_mask.sig[0] &
-						HW_PRTY_ASSERT_SET_0) ||
-			    (attn.sig[1] & group_mask.sig[1] &
-						HW_PRTY_ASSERT_SET_1) ||
-			    (attn.sig[2] & group_mask.sig[2] &
-						HW_PRTY_ASSERT_SET_2))
-				BNX2X_ERR("FATAL HW block parity attention\n");
+					attn.sig[0] & group_mask->sig[0]);
 		}
 	}
 
@@ -3173,7 +3512,7 @@ static void bnx2x_attn_int_deasserted(struct bnx2x *bp, u32 deasserted)
 
 	DP(NETIF_MSG_HW, "aeu_mask %x  newly deasserted %x\n",
 	   aeu_mask, deasserted);
-	aeu_mask |= (deasserted & 0xff);
+	aeu_mask |= (deasserted & 0x3ff);
 	DP(NETIF_MSG_HW, "new mask %x\n", aeu_mask);
 
 	REG_WR(bp, reg_addr, aeu_mask);
@@ -5963,6 +6302,50 @@ static void enable_blocks_attention(struct bnx2x *bp)
 	REG_WR(bp, PBF_REG_PBF_INT_MASK, 0X18);		/* bit 3,4 masked */
 }
 
+static const struct {
+	u32 addr;
+	u32 mask;
+} bnx2x_parity_mask[] = {
+	{PXP_REG_PXP_PRTY_MASK, 0xffffffff},
+	{PXP2_REG_PXP2_PRTY_MASK_0, 0xffffffff},
+	{PXP2_REG_PXP2_PRTY_MASK_1, 0xffffffff},
+	{HC_REG_HC_PRTY_MASK, 0xffffffff},
+	{MISC_REG_MISC_PRTY_MASK, 0xffffffff},
+	{QM_REG_QM_PRTY_MASK, 0x0},
+	{DORQ_REG_DORQ_PRTY_MASK, 0x0},
+	{GRCBASE_UPB + PB_REG_PB_PRTY_MASK, 0x0},
+	{GRCBASE_XPB + PB_REG_PB_PRTY_MASK, 0x0},
+	{SRC_REG_SRC_PRTY_MASK, 0x4}, /* bit 2 */
+	{CDU_REG_CDU_PRTY_MASK, 0x0},
+	{CFC_REG_CFC_PRTY_MASK, 0x0},
+	{DBG_REG_DBG_PRTY_MASK, 0x0},
+	{DMAE_REG_DMAE_PRTY_MASK, 0x0},
+	{BRB1_REG_BRB1_PRTY_MASK, 0x0},
+	{PRS_REG_PRS_PRTY_MASK, (1<<6)},/* bit 6 */
+	{TSDM_REG_TSDM_PRTY_MASK, 0x18},/* bit 3,4 */
+	{CSDM_REG_CSDM_PRTY_MASK, 0x8},	/* bit 3 */
+	{USDM_REG_USDM_PRTY_MASK, 0x38},/* bit 3,4,5 */
+	{XSDM_REG_XSDM_PRTY_MASK, 0x8},	/* bit 3 */
+	{TSEM_REG_TSEM_PRTY_MASK_0, 0x0},
+	{TSEM_REG_TSEM_PRTY_MASK_1, 0x0},
+	{USEM_REG_USEM_PRTY_MASK_0, 0x0},
+	{USEM_REG_USEM_PRTY_MASK_1, 0x0},
+	{CSEM_REG_CSEM_PRTY_MASK_0, 0x0},
+	{CSEM_REG_CSEM_PRTY_MASK_1, 0x0},
+	{XSEM_REG_XSEM_PRTY_MASK_0, 0x0},
+	{XSEM_REG_XSEM_PRTY_MASK_1, 0x0}
+};
+
+static void enable_blocks_parity(struct bnx2x *bp)
+{
+	int i, mask_arr_len =
+		sizeof(bnx2x_parity_mask)/(sizeof(bnx2x_parity_mask[0]));
+
+	for (i = 0; i < mask_arr_len; i++)
+		REG_WR(bp, bnx2x_parity_mask[i].addr,
+			bnx2x_parity_mask[i].mask);
+}
+
 
 static void bnx2x_reset_common(struct bnx2x *bp)
 {
@@ -6306,6 +6689,8 @@ static int bnx2x_init_common(struct bnx2x *bp)
 	REG_RD(bp, PXP2_REG_PXP2_INT_STS_CLR_0);
 
 	enable_blocks_attention(bp);
+	if (CHIP_PARITY_SUPPORTED(bp))
+		enable_blocks_parity(bp);
 
 	if (!BP_NOMCP(bp)) {
 		bnx2x_acquire_phy_lock(bp);
@@ -7657,6 +8042,7 @@ static int bnx2x_nic_load(struct bnx2x *bp, int load_mode)
 	if (bp->state == BNX2X_STATE_OPEN)
 		bnx2x_cnic_notify(bp, CNIC_CTL_START_CMD);
 #endif
+	bnx2x_inc_load_cnt(bp);
 
 	return 0;
 
@@ -7844,33 +8230,12 @@ static void bnx2x_reset_chip(struct bnx2x *bp, u32 reset_code)
 	}
 }
 
-/* must be called with rtnl_lock */
-static int bnx2x_nic_unload(struct bnx2x *bp, int unload_mode)
+static void bnx2x_chip_cleanup(struct bnx2x *bp, int unload_mode)
 {
 	int port = BP_PORT(bp);
 	u32 reset_code = 0;
 	int i, cnt, rc;
 
-#ifdef BCM_CNIC
-	bnx2x_cnic_notify(bp, CNIC_CTL_STOP_CMD);
-#endif
-	bp->state = BNX2X_STATE_CLOSING_WAIT4_HALT;
-
-	/* Set "drop all" */
-	bp->rx_mode = BNX2X_RX_MODE_NONE;
-	bnx2x_set_storm_rx_mode(bp);
-
-	/* Disable HW interrupts, NAPI and Tx */
-	bnx2x_netif_stop(bp, 1);
-
-	del_timer_sync(&bp->timer);
-	SHMEM_WR(bp, func_mb[BP_FUNC(bp)].drv_pulse_mb,
-		 (DRV_PULSE_ALWAYS_ALIVE | bp->fw_drv_pulse_wr_seq));
-	bnx2x_stats_handle(bp, STATS_EVENT_STOP);
-
-	/* Release IRQs */
-	bnx2x_free_irq(bp, false);
-
 	/* Wait until tx fastpath tasks complete */
 	for_each_queue(bp, i) {
 		struct bnx2x_fastpath *fp = &bp->fp[i];
@@ -8011,6 +8376,69 @@ unload_error:
 	if (!BP_NOMCP(bp))
 		bnx2x_fw_command(bp, DRV_MSG_CODE_UNLOAD_DONE);
 
+}
+
+static inline void bnx2x_disable_close_the_gate(struct bnx2x *bp)
+{
+	u32 val;
+
+	DP(NETIF_MSG_HW, "Disabling \"close the gates\"\n");
+
+	if (CHIP_IS_E1(bp)) {
+		int port = BP_PORT(bp);
+		u32 addr = port ? MISC_REG_AEU_MASK_ATTN_FUNC_1 :
+			MISC_REG_AEU_MASK_ATTN_FUNC_0;
+
+		val = REG_RD(bp, addr);
+		val &= ~(0x300);
+		REG_WR(bp, addr, val);
+	} else if (CHIP_IS_E1H(bp)) {
+		val = REG_RD(bp, MISC_REG_AEU_GENERAL_MASK);
+		val &= ~(MISC_AEU_GENERAL_MASK_REG_AEU_PXP_CLOSE_MASK |
+			 MISC_AEU_GENERAL_MASK_REG_AEU_NIG_CLOSE_MASK);
+		REG_WR(bp, MISC_REG_AEU_GENERAL_MASK, val);
+	}
+}
+
+/* must be called with rtnl_lock */
+static int bnx2x_nic_unload(struct bnx2x *bp, int unload_mode)
+{
+	int i;
+
+	if (bp->state == BNX2X_STATE_CLOSED) {
+		/* Interface has been removed - nothing to recover */
+		bp->recovery_state = BNX2X_RECOVERY_DONE;
+		bp->is_leader = 0;
+		bnx2x_release_hw_lock(bp, HW_LOCK_RESOURCE_RESERVED_08);
+		smp_wmb();
+
+		return -EINVAL;
+	}
+
+#ifdef BCM_CNIC
+	bnx2x_cnic_notify(bp, CNIC_CTL_STOP_CMD);
+#endif
+	bp->state = BNX2X_STATE_CLOSING_WAIT4_HALT;
+
+	/* Set "drop all" */
+	bp->rx_mode = BNX2X_RX_MODE_NONE;
+	bnx2x_set_storm_rx_mode(bp);
+
+	/* Disable HW interrupts, NAPI and Tx */
+	bnx2x_netif_stop(bp, 1);
+
+	del_timer_sync(&bp->timer);
+	SHMEM_WR(bp, func_mb[BP_FUNC(bp)].drv_pulse_mb,
+		 (DRV_PULSE_ALWAYS_ALIVE | bp->fw_drv_pulse_wr_seq));
+	bnx2x_stats_handle(bp, STATS_EVENT_STOP);
+
+	/* Release IRQs */
+	bnx2x_free_irq(bp, false);
+
+	/* Cleanup the chip if needed */
+	if (unload_mode != UNLOAD_RECOVERY)
+		bnx2x_chip_cleanup(bp, unload_mode);
+
 	bp->port.pmf = 0;
 
 	/* Free SKBs, SGEs, TPA pool and driver internals */
@@ -8025,17 +8453,448 @@ unload_error:
 
 	netif_carrier_off(bp->dev);
 
+	/* The last driver must disable a "close the gate" if there is no
+	 * parity attention or "process kill" pending.
+	 */
+	if ((!bnx2x_dec_load_cnt(bp)) && (!bnx2x_chk_parity_attn(bp)) &&
+	    bnx2x_reset_is_done(bp))
+		bnx2x_disable_close_the_gate(bp);
+
+	/* Reset MCP mail box sequence if there is an ongoing recovery */
+	if (unload_mode == UNLOAD_RECOVERY)
+		bp->fw_seq = 0;
+
+	return 0;
+}
+
+/* Close or open gates #2, #3 and #4: */
+static void bnx2x_set_234_gates(struct bnx2x *bp, bool close)
+{
+	u32 val, addr;
+
+	/* Gates #2 and #4a are closed/opened for "not E1" only */
+	if (!CHIP_IS_E1(bp)) {
+		/* #4 */
+		val = REG_RD(bp, PXP_REG_HST_DISCARD_DOORBELLS);
+		REG_WR(bp, PXP_REG_HST_DISCARD_DOORBELLS,
+		       close ? (val | 0x1) : (val & (~(u32)1)));
+		/* #2 */
+		val = REG_RD(bp, PXP_REG_HST_DISCARD_INTERNAL_WRITES);
+		REG_WR(bp, PXP_REG_HST_DISCARD_INTERNAL_WRITES,
+		       close ? (val | 0x1) : (val & (~(u32)1)));
+	}
+
+	/* #3 */
+	addr = BP_PORT(bp) ? HC_REG_CONFIG_1 : HC_REG_CONFIG_0;
+	val = REG_RD(bp, addr);
+	REG_WR(bp, addr, (!close) ? (val | 0x1) : (val & (~(u32)1)));
+
+	DP(NETIF_MSG_HW, "%s gates #2, #3 and #4\n",
+		close ? "closing" : "opening");
+	mmiowb();
+}
+
+#define SHARED_MF_CLP_MAGIC  0x80000000 /* `magic' bit */
+
+static void bnx2x_clp_reset_prep(struct bnx2x *bp, u32 *magic_val)
+{
+	/* Do some magic... */
+	u32 val = MF_CFG_RD(bp, shared_mf_config.clp_mb);
+	*magic_val = val & SHARED_MF_CLP_MAGIC;
+	MF_CFG_WR(bp, shared_mf_config.clp_mb, val | SHARED_MF_CLP_MAGIC);
+}
+
+/* Restore the value of the `magic' bit.
+ *
+ * @param bp Device handle.
+ * @param magic_val Old value of the `magic' bit.
+ */
+static void bnx2x_clp_reset_done(struct bnx2x *bp, u32 magic_val)
+{
+	/* Restore the `magic' bit value... */
+	/* u32 val = SHMEM_RD(bp, mf_cfg.shared_mf_config.clp_mb);
+	SHMEM_WR(bp, mf_cfg.shared_mf_config.clp_mb,
+		(val & (~SHARED_MF_CLP_MAGIC)) | magic_val); */
+	u32 val = MF_CFG_RD(bp, shared_mf_config.clp_mb);
+	MF_CFG_WR(bp, shared_mf_config.clp_mb,
+		(val & (~SHARED_MF_CLP_MAGIC)) | magic_val);
+}
+
+/* Prepares for MCP reset: takes care of CLP configurations.
+ *
+ * @param bp
+ * @param magic_val Old value of 'magic' bit.
+ */
+static void bnx2x_reset_mcp_prep(struct bnx2x *bp, u32 *magic_val)
+{
+	u32 shmem;
+	u32 validity_offset;
+
+	DP(NETIF_MSG_HW, "Starting\n");
+
+	/* Set `magic' bit in order to save MF config */
+	if (!CHIP_IS_E1(bp))
+		bnx2x_clp_reset_prep(bp, magic_val);
+
+	/* Get shmem offset */
+	shmem = REG_RD(bp, MISC_REG_SHARED_MEM_ADDR);
+	validity_offset = offsetof(struct shmem_region, validity_map[0]);
+
+	/* Clear validity map flags */
+	if (shmem > 0)
+		REG_WR(bp, shmem + validity_offset, 0);
+}
+
+#define MCP_TIMEOUT      5000   /* 5 seconds (in ms) */
+#define MCP_ONE_TIMEOUT  100    /* 100 ms */
+
+/* Waits for MCP_ONE_TIMEOUT or MCP_ONE_TIMEOUT*10,
+ * depending on the HW type.
+ *
+ * @param bp
+ */
+static inline void bnx2x_mcp_wait_one(struct bnx2x *bp)
+{
+	/* special handling for emulation and FPGA,
+	   wait 10 times longer */
+	if (CHIP_REV_IS_SLOW(bp))
+		msleep(MCP_ONE_TIMEOUT*10);
+	else
+		msleep(MCP_ONE_TIMEOUT);
+}
+
+static int bnx2x_reset_mcp_comp(struct bnx2x *bp, u32 magic_val)
+{
+	u32 shmem, cnt, validity_offset, val;
+	int rc = 0;
+
+	msleep(100);
+
+	/* Get shmem offset */
+	shmem = REG_RD(bp, MISC_REG_SHARED_MEM_ADDR);
+	if (shmem == 0) {
+		BNX2X_ERR("Shmem 0 return failure\n");
+		rc = -ENOTTY;
+		goto exit_lbl;
+	}
+
+	validity_offset = offsetof(struct shmem_region, validity_map[0]);
+
+	/* Wait for MCP to come up */
+	for (cnt = 0; cnt < (MCP_TIMEOUT / MCP_ONE_TIMEOUT); cnt++) {
+		/* TBD: it's best to check validity map of last port.
+		 * currently checks on port 0.
+		 */
+		val = REG_RD(bp, shmem + validity_offset);
+		DP(NETIF_MSG_HW, "shmem 0x%x validity map(0x%x)=0x%x\n", shmem,
+		   shmem + validity_offset, val);
+
+		/* check that shared memory is valid. */
+		if ((val & (SHR_MEM_VALIDITY_DEV_INFO | SHR_MEM_VALIDITY_MB))
+		    == (SHR_MEM_VALIDITY_DEV_INFO | SHR_MEM_VALIDITY_MB))
+			break;
+
+		bnx2x_mcp_wait_one(bp);
+	}
+
+	DP(NETIF_MSG_HW, "Cnt=%d Shmem validity map 0x%x\n", cnt, val);
+
+	/* Check that shared memory is valid. This indicates that MCP is up. */
+	if ((val & (SHR_MEM_VALIDITY_DEV_INFO | SHR_MEM_VALIDITY_MB)) !=
+	    (SHR_MEM_VALIDITY_DEV_INFO | SHR_MEM_VALIDITY_MB)) {
+		BNX2X_ERR("Shmem signature not present. MCP is not up !!\n");
+		rc = -ENOTTY;
+		goto exit_lbl;
+	}
+
+exit_lbl:
+	/* Restore the `magic' bit value */
+	if (!CHIP_IS_E1(bp))
+		bnx2x_clp_reset_done(bp, magic_val);
+
+	return rc;
+}
+
+static void bnx2x_pxp_prep(struct bnx2x *bp)
+{
+	if (!CHIP_IS_E1(bp)) {
+		REG_WR(bp, PXP2_REG_RD_START_INIT, 0);
+		REG_WR(bp, PXP2_REG_RQ_RBC_DONE, 0);
+		REG_WR(bp, PXP2_REG_RQ_CFG_DONE, 0);
+		mmiowb();
+	}
+}
+
+/*
+ * Reset the whole chip except for:
+ *      - PCIE core
+ *      - PCI Glue, PSWHST, PXP/PXP2 RF (all controlled by
+ *              one reset bit)
+ *      - IGU
+ *      - MISC (including AEU)
+ *      - GRC
+ *      - RBCN, RBCP
+ */
+static void bnx2x_process_kill_chip_reset(struct bnx2x *bp)
+{
+	u32 not_reset_mask1, reset_mask1, not_reset_mask2, reset_mask2;
+
+	not_reset_mask1 =
+		MISC_REGISTERS_RESET_REG_1_RST_HC |
+		MISC_REGISTERS_RESET_REG_1_RST_PXPV |
+		MISC_REGISTERS_RESET_REG_1_RST_PXP;
+
+	not_reset_mask2 =
+		MISC_REGISTERS_RESET_REG_2_RST_MDIO |
+		MISC_REGISTERS_RESET_REG_2_RST_EMAC0_HARD_CORE |
+		MISC_REGISTERS_RESET_REG_2_RST_EMAC1_HARD_CORE |
+		MISC_REGISTERS_RESET_REG_2_RST_MISC_CORE |
+		MISC_REGISTERS_RESET_REG_2_RST_RBCN |
+		MISC_REGISTERS_RESET_REG_2_RST_GRC  |
+		MISC_REGISTERS_RESET_REG_2_RST_MCP_N_RESET_REG_HARD_CORE |
+		MISC_REGISTERS_RESET_REG_2_RST_MCP_N_HARD_CORE_RST_B;
+
+	reset_mask1 = 0xffffffff;
+
+	if (CHIP_IS_E1(bp))
+		reset_mask2 = 0xffff;
+	else
+		reset_mask2 = 0x1ffff;
+
+	REG_WR(bp, GRCBASE_MISC + MISC_REGISTERS_RESET_REG_1_CLEAR,
+	       reset_mask1 & (~not_reset_mask1));
+	REG_WR(bp, GRCBASE_MISC + MISC_REGISTERS_RESET_REG_2_CLEAR,
+	       reset_mask2 & (~not_reset_mask2));
+
+	barrier();
+	mmiowb();
+
+	REG_WR(bp, GRCBASE_MISC + MISC_REGISTERS_RESET_REG_1_SET, reset_mask1);
+	REG_WR(bp, GRCBASE_MISC + MISC_REGISTERS_RESET_REG_2_SET, reset_mask2);
+	mmiowb();
+}
+
+static int bnx2x_process_kill(struct bnx2x *bp)
+{
+	int cnt = 1000;
+	u32 val = 0;
+	u32 sr_cnt, blk_cnt, port_is_idle_0, port_is_idle_1, pgl_exp_rom2;
+
+
+	/* Empty the Tetris buffer, wait for 1s */
+	do {
+		sr_cnt  = REG_RD(bp, PXP2_REG_RD_SR_CNT);
+		blk_cnt = REG_RD(bp, PXP2_REG_RD_BLK_CNT);
+		port_is_idle_0 = REG_RD(bp, PXP2_REG_RD_PORT_IS_IDLE_0);
+		port_is_idle_1 = REG_RD(bp, PXP2_REG_RD_PORT_IS_IDLE_1);
+		pgl_exp_rom2 = REG_RD(bp, PXP2_REG_PGL_EXP_ROM2);
+		if ((sr_cnt == 0x7e) && (blk_cnt == 0xa0) &&
+		    ((port_is_idle_0 & 0x1) == 0x1) &&
+		    ((port_is_idle_1 & 0x1) == 0x1) &&
+		    (pgl_exp_rom2 == 0xffffffff))
+			break;
+		msleep(1);
+	} while (cnt-- > 0);
+
+	if (cnt <= 0) {
+		DP(NETIF_MSG_HW, "Tetris buffer didn't get empty or there"
+			  " are still"
+			  " outstanding read requests after 1s!\n");
+		DP(NETIF_MSG_HW, "sr_cnt=0x%08x, blk_cnt=0x%08x,"
+			  " port_is_idle_0=0x%08x,"
+			  " port_is_idle_1=0x%08x, pgl_exp_rom2=0x%08x\n",
+			  sr_cnt, blk_cnt, port_is_idle_0, port_is_idle_1,
+			  pgl_exp_rom2);
+		return -EAGAIN;
+	}
+
+	barrier();
+
+	/* Close gates #2, #3 and #4 */
+	bnx2x_set_234_gates(bp, true);
+
+	/* TBD: Indicate that "process kill" is in progress to MCP */
+
+	/* Clear "unprepared" bit */
+	REG_WR(bp, MISC_REG_UNPREPARED, 0);
+	barrier();
+
+	/* Make sure all is written to the chip before the reset */
+	mmiowb();
+
+	/* Wait for 1ms to empty GLUE and PCI-E core queues,
+	 * PSWHST, GRC and PSWRD Tetris buffer.
+	 */
+	msleep(1);
+
+	/* Prepare to chip reset: */
+	/* MCP */
+	bnx2x_reset_mcp_prep(bp, &val);
+
+	/* PXP */
+	bnx2x_pxp_prep(bp);
+	barrier();
+
+	/* reset the chip */
+	bnx2x_process_kill_chip_reset(bp);
+	barrier();
+
+	/* Recover after reset: */
+	/* MCP */
+	if (bnx2x_reset_mcp_comp(bp, val))
+		return -EAGAIN;
+
+	/* PXP */
+	bnx2x_pxp_prep(bp);
+
+	/* Open the gates #2, #3 and #4 */
+	bnx2x_set_234_gates(bp, false);
+
+	/* TBD: IGU/AEU preparation bring back the AEU/IGU to a
+	 * reset state, re-enable attentions. */
+
 	return 0;
 }
 
+static int bnx2x_leader_reset(struct bnx2x *bp)
+{
+	int rc = 0;
+	/* Try to recover after the failure */
+	if (bnx2x_process_kill(bp)) {
+		printk(KERN_ERR "%s: Something bad has happened! Aii!\n",
+		       bp->dev->name);
+		rc = -EAGAIN;
+		goto exit_leader_reset;
+	}
+
+	/* Clear "reset is in progress" bit and update the driver state */
+	bnx2x_set_reset_done(bp);
+	bp->recovery_state = BNX2X_RECOVERY_DONE;
+
+exit_leader_reset:
+	bp->is_leader = 0;
+	bnx2x_release_hw_lock(bp, HW_LOCK_RESOURCE_RESERVED_08);
+	smp_wmb();
+	return rc;
+}
+
+static int bnx2x_set_power_state(struct bnx2x *bp, pci_power_t state);
+
+/* Assumption: runs under rtnl lock. This together with the fact
+ * that it's called only from bnx2x_reset_task() ensures that it
+ * will never be called when netif_running(bp->dev) is false.
+ */
+static void bnx2x_parity_recover(struct bnx2x *bp)
+{
+	DP(NETIF_MSG_HW, "Handling parity\n");
+	while (1) {
+		switch (bp->recovery_state) {
+		case BNX2X_RECOVERY_INIT:
+			DP(NETIF_MSG_HW, "State is BNX2X_RECOVERY_INIT\n");
+			/* Try to get a LEADER_LOCK HW lock */
+			if (bnx2x_trylock_hw_lock(bp,
+				HW_LOCK_RESOURCE_RESERVED_08))
+				bp->is_leader = 1;
+
+			/* Stop the driver */
+			/* If interface has been removed - break */
+			if (bnx2x_nic_unload(bp, UNLOAD_RECOVERY))
+				return;
+
+			bp->recovery_state = BNX2X_RECOVERY_WAIT;
+			/* Ensure "is_leader" and "recovery_state"
+			 *  update values are seen on other CPUs
+			 */
+			smp_wmb();
+			break;
+
+		case BNX2X_RECOVERY_WAIT:
+			DP(NETIF_MSG_HW, "State is BNX2X_RECOVERY_WAIT\n");
+			if (bp->is_leader) {
+				u32 load_counter = bnx2x_get_load_cnt(bp);
+				if (load_counter) {
+					/* Wait until all other functions get
+					 * down.
+					 */
+					schedule_delayed_work(&bp->reset_task,
+								HZ/10);
+					return;
+				} else {
+					/* If all other functions got down -
+					 * try to bring the chip back to
+					 * normal. In any case it's an exit
+					 * point for a leader.
+					 */
+					if (bnx2x_leader_reset(bp) ||
+					bnx2x_nic_load(bp, LOAD_NORMAL)) {
+						printk(KERN_ERR"%s: Recovery "
+						"has failed. Power cycle is "
+						"needed.\n", bp->dev->name);
+						/* Disconnect this device */
+						netif_device_detach(bp->dev);
+						/* Block ifup for all functions
+						 * of this ASIC until
+						 * "process kill" or power
+						 * cycle.
+						 */
+						bnx2x_set_reset_in_progress(bp);
+						/* Shut down the power */
+						bnx2x_set_power_state(bp,
+								PCI_D3hot);
+						return;
+					}
+
+					return;
+				}
+			} else { /* non-leader */
+				if (!bnx2x_reset_is_done(bp)) {
+					/* Try to get the LEADER_LOCK HW lock,
+					 * since a former leader may have
+					 * been unloaded by the user or may
+					 * have released leadership for
+					 * another reason.
+					 */
+					if (bnx2x_trylock_hw_lock(bp,
+					    HW_LOCK_RESOURCE_RESERVED_08)) {
+						/* I'm a leader now! Restart a
+						 * switch case.
+						 */
+						bp->is_leader = 1;
+						break;
+					}
+
+					schedule_delayed_work(&bp->reset_task,
+								HZ/10);
+					return;
+
+				} else { /* A leader has completed
+					  * the "process kill". It's an exit
+					  * point for a non-leader.
+					  */
+					bnx2x_nic_load(bp, LOAD_NORMAL);
+					bp->recovery_state =
+						BNX2X_RECOVERY_DONE;
+					smp_wmb();
+					return;
+				}
+			}
+		default:
+			return;
+		}
+	}
+}
+
+/* bnx2x_nic_unload() flushes the bnx2x_wq, thus reset task is
+ * scheduled on a general queue in order to prevent a dead lock.
+ */
 static void bnx2x_reset_task(struct work_struct *work)
 {
-	struct bnx2x *bp = container_of(work, struct bnx2x, reset_task);
+	struct bnx2x *bp = container_of(work, struct bnx2x, reset_task.work);
 
 #ifdef BNX2X_STOP_ON_ERROR
 	BNX2X_ERR("reset task called but STOP_ON_ERROR defined"
 		  " so reset not done to allow debug dump,\n"
-		  " you will need to reboot when done\n");
+	 KERN_ERR " you will need to reboot when done\n");
 	return;
 #endif
 
@@ -8044,8 +8903,12 @@ static void bnx2x_reset_task(struct work_struct *work)
 	if (!netif_running(bp->dev))
 		goto reset_task_exit;
 
-	bnx2x_nic_unload(bp, UNLOAD_NORMAL);
-	bnx2x_nic_load(bp, LOAD_NORMAL);
+	if (unlikely(bp->recovery_state != BNX2X_RECOVERY_DONE))
+		bnx2x_parity_recover(bp);
+	else {
+		bnx2x_nic_unload(bp, UNLOAD_NORMAL);
+		bnx2x_nic_load(bp, LOAD_NORMAL);
+	}
 
 reset_task_exit:
 	rtnl_unlock();
@@ -8913,7 +9776,7 @@ static int __devinit bnx2x_init_bp(struct bnx2x *bp)
 #endif
 
 	INIT_DELAYED_WORK(&bp->sp_task, bnx2x_sp_task);
-	INIT_WORK(&bp->reset_task, bnx2x_reset_task);
+	INIT_DELAYED_WORK(&bp->reset_task, bnx2x_reset_task);
 
 	rc = bnx2x_get_hwinfo(bp);
 
@@ -9888,6 +10751,11 @@ static int bnx2x_set_ringparam(struct net_device *dev,
 	struct bnx2x *bp = netdev_priv(dev);
 	int rc = 0;
 
+	if (bp->recovery_state != BNX2X_RECOVERY_DONE) {
+		printk(KERN_ERR "Handling parity error recovery. Try again later\n");
+		return -EAGAIN;
+	}
+
 	if ((ering->rx_pending > MAX_RX_AVAIL) ||
 	    (ering->tx_pending > MAX_TX_AVAIL) ||
 	    (ering->tx_pending <= MAX_SKB_FRAGS + 4))
@@ -9973,6 +10841,11 @@ static int bnx2x_set_flags(struct net_device *dev, u32 data)
 	int changed = 0;
 	int rc = 0;
 
+	if (bp->recovery_state != BNX2X_RECOVERY_DONE) {
+		printk(KERN_ERR "Handling parity error recovery. Try again later\n");
+		return -EAGAIN;
+	}
+
 	/* TPA requires Rx CSUM offloading */
 	if ((data & ETH_FLAG_LRO) && bp->rx_csum) {
 		if (!disable_tpa) {
@@ -10009,6 +10882,11 @@ static int bnx2x_set_rx_csum(struct net_device *dev, u32 data)
 	struct bnx2x *bp = netdev_priv(dev);
 	int rc = 0;
 
+	if (bp->recovery_state != BNX2X_RECOVERY_DONE) {
+		printk(KERN_ERR "Handling parity error recovery. Try again later\n");
+		return -EAGAIN;
+	}
+
 	bp->rx_csum = data;
 
 	/* Disable TPA, when Rx CSUM is disabled. Otherwise all
@@ -10471,6 +11349,12 @@ static void bnx2x_self_test(struct net_device *dev,
 {
 	struct bnx2x *bp = netdev_priv(dev);
 
+	if (bp->recovery_state != BNX2X_RECOVERY_DONE) {
+		printk(KERN_ERR "Handling parity error recovery. Try again later\n");
+		etest->flags |= ETH_TEST_FL_FAILED;
+		return;
+	}
+
 	memset(buf, 0, sizeof(u64) * BNX2X_NUM_TESTS);
 
 	if (!netif_running(dev))
@@ -11456,6 +12340,40 @@ static int bnx2x_open(struct net_device *dev)
 
 	bnx2x_set_power_state(bp, PCI_D0);
 
+	if (!bnx2x_reset_is_done(bp)) {
+		do {
+			/* Reset MCP mail box sequence if there is an ongoing
+			 * recovery
+			 */
+			bp->fw_seq = 0;
+
+			/* If it's the first function to load and the reset is
+			 * still not marked done, an earlier recovery may not have
+			 * completed. We don't check the attention state here: a
+			 * "common" reset may already have cleared it, but we
+			 * shall proceed with "process kill" anyway.
+			 */
+			if ((bnx2x_get_load_cnt(bp) == 0) &&
+				bnx2x_trylock_hw_lock(bp,
+				HW_LOCK_RESOURCE_RESERVED_08) &&
+				(!bnx2x_leader_reset(bp))) {
+				DP(NETIF_MSG_HW, "Recovered in open\n");
+				break;
+			}
+
+			bnx2x_set_power_state(bp, PCI_D3hot);
+
+			printk(KERN_ERR"%s: Recovery flow hasn't been properly"
+			" completed yet. Try again later. If you still see this"
+			" message after a few retries then power cycle is"
+			" required.\n", bp->dev->name);
+
+			return -EAGAIN;
+		} while (0);
+	}
+
+	bp->recovery_state = BNX2X_RECOVERY_DONE;
+
 	return bnx2x_nic_load(bp, LOAD_OPEN);
 }
 
@@ -11694,6 +12612,11 @@ static int bnx2x_change_mtu(struct net_device *dev, int new_mtu)
 	struct bnx2x *bp = netdev_priv(dev);
 	int rc = 0;
 
+	if (bp->recovery_state != BNX2X_RECOVERY_DONE) {
+		printk(KERN_ERR "Handling parity error recovery. Try again later\n");
+		return -EAGAIN;
+	}
+
 	if ((new_mtu > ETH_MAX_JUMBO_PACKET_SIZE) ||
 	    ((new_mtu + ETH_HLEN) < ETH_MIN_PACKET_SIZE))
 		return -EINVAL;
@@ -11721,7 +12644,7 @@ static void bnx2x_tx_timeout(struct net_device *dev)
 		bnx2x_panic();
 #endif
 	/* This allows the netif to be shutdown gracefully before resetting */
-	schedule_work(&bp->reset_task);
+	schedule_delayed_work(&bp->reset_task, 0);
 }
 
 #ifdef BCM_VLAN
@@ -11880,6 +12803,9 @@ static int __devinit bnx2x_init_dev(struct pci_dev *pdev,
 	REG_WR(bp, PXP2_REG_PGL_ADDR_90_F0 + BP_PORT(bp)*16, 0);
 	REG_WR(bp, PXP2_REG_PGL_ADDR_94_F0 + BP_PORT(bp)*16, 0);
 
+	/* Reset the load counter */
+	bnx2x_clear_load_cnt(bp);
+
 	dev->watchdog_timeo = TX_TIMEOUT;
 
 	dev->netdev_ops = &bnx2x_netdev_ops;
@@ -12205,6 +13131,9 @@ static void __devexit bnx2x_remove_one(struct pci_dev *pdev)
 
 	unregister_netdev(dev);
 
+	/* Make sure RESET task is not scheduled before continuing */
+	cancel_delayed_work_sync(&bp->reset_task);
+
 	kfree(bp->init_ops_offsets);
 	kfree(bp->init_ops);
 	kfree(bp->init_data);
@@ -12268,6 +13197,11 @@ static int bnx2x_resume(struct pci_dev *pdev)
 	}
 	bp = netdev_priv(dev);
 
+	if (bp->recovery_state != BNX2X_RECOVERY_DONE) {
+		printk(KERN_ERR "Handling parity error recovery. Try again later\n");
+		return -EAGAIN;
+	}
+
 	rtnl_lock();
 
 	pci_restore_state(pdev);
@@ -12434,6 +13368,11 @@ static void bnx2x_io_resume(struct pci_dev *pdev)
 	struct net_device *dev = pci_get_drvdata(pdev);
 	struct bnx2x *bp = netdev_priv(dev);
 
+	if (bp->recovery_state != BNX2X_RECOVERY_DONE) {
+		printk(KERN_ERR "Handling parity error recovery. Try again later\n");
+		return;
+	}
+
 	rtnl_lock();
 
 	bnx2x_eeh_recover(bp);
diff --git a/drivers/net/bnx2x_reg.h b/drivers/net/bnx2x_reg.h
index 944964e..a1f3bf0 100644
--- a/drivers/net/bnx2x_reg.h
+++ b/drivers/net/bnx2x_reg.h
@@ -766,6 +766,8 @@
 #define MCP_REG_MCPR_NVM_SW_ARB 				 0x86420
 #define MCP_REG_MCPR_NVM_WRITE					 0x86408
 #define MCP_REG_MCPR_SCRATCH					 0xa0000
+#define MISC_AEU_GENERAL_MASK_REG_AEU_NIG_CLOSE_MASK		 (0x1<<1)
+#define MISC_AEU_GENERAL_MASK_REG_AEU_PXP_CLOSE_MASK		 (0x1<<0)
 /* [R 32] read first 32 bit after inversion of function 0. mapped as
    follows: [0] NIG attention for function0; [1] NIG attention for
    function1; [2] GPIO1 mcp; [3] GPIO2 mcp; [4] GPIO3 mcp; [5] GPIO4 mcp;
@@ -1249,6 +1251,8 @@
 #define MISC_REG_E1HMF_MODE					 0xa5f8
 /* [RW 32] Debug only: spare RW register reset by core reset */
 #define MISC_REG_GENERIC_CR_0					 0xa460
+/* [RW 32] Debug only: spare RW register reset by por reset */
+#define MISC_REG_GENERIC_POR_1					 0xa474
 /* [RW 32] GPIO. [31-28] FLOAT port 0; [27-24] FLOAT port 0; When any of
    these bits is written as a '1'; the corresponding SPIO bit will turn off
    it's drivers and become an input. This is the reset state of all GPIO
@@ -1438,7 +1442,7 @@
    (~misc_registers_sw_timer_cfg_4.sw_timer_cfg_4[1] ) is set */
 #define MISC_REG_SW_TIMER_RELOAD_VAL_4				 0xa2fc
 /* [RW 32] the value of the counter for sw timers1-8. there are 8 addresses
-   in this register. addres 0 - timer 1; address - timer 2�address 7 -
+   in this register. addres 0 - timer 1; address 1 - timer 2, ...  address 7 -
    timer 8 */
 #define MISC_REG_SW_TIMER_VAL					 0xa5c0
 /* [RW 1] Set by the MCP to remember if one or more of the drivers is/are
@@ -2407,10 +2411,16 @@
 /* [R 8] debug only: A bit mask for all PSWHST arbiter clients. '1' means
    this client is waiting for the arbiter. */
 #define PXP_REG_HST_CLIENTS_WAITING_TO_ARB			 0x103008
+/* [RW 1] When 1; doorbells are discarded and not passed to doorbell queue
+   block. Should be used to close the gates. */
+#define PXP_REG_HST_DISCARD_DOORBELLS				 0x1030a4
 /* [R 1] debug only: '1' means this PSWHST is discarding doorbells. This bit
    should update accoring to 'hst_discard_doorbells' register when the state
    machine is idle */
 #define PXP_REG_HST_DISCARD_DOORBELLS_STATUS			 0x1030a0
+/* [RW 1] When 1; new internal writes arriving to the block are discarded.
+   Should be used to close the gates. */
+#define PXP_REG_HST_DISCARD_INTERNAL_WRITES			 0x1030a8
 /* [R 6] debug only: A bit mask for all PSWHST internal write clients. '1'
    means this PSWHST is discarding inputs from this client. Each bit should
    update accoring to 'hst_discard_internal_writes' register when the state
@@ -4422,11 +4432,21 @@
 #define MISC_REGISTERS_GPIO_PORT_SHIFT				 4
 #define MISC_REGISTERS_GPIO_SET_POS				 8
 #define MISC_REGISTERS_RESET_REG_1_CLEAR			 0x588
+#define MISC_REGISTERS_RESET_REG_1_RST_HC			 (0x1<<29)
 #define MISC_REGISTERS_RESET_REG_1_RST_NIG			 (0x1<<7)
+#define MISC_REGISTERS_RESET_REG_1_RST_PXP			 (0x1<<26)
+#define MISC_REGISTERS_RESET_REG_1_RST_PXPV			 (0x1<<27)
 #define MISC_REGISTERS_RESET_REG_1_SET				 0x584
 #define MISC_REGISTERS_RESET_REG_2_CLEAR			 0x598
 #define MISC_REGISTERS_RESET_REG_2_RST_BMAC0			 (0x1<<0)
 #define MISC_REGISTERS_RESET_REG_2_RST_EMAC0_HARD_CORE		 (0x1<<14)
+#define MISC_REGISTERS_RESET_REG_2_RST_EMAC1_HARD_CORE		 (0x1<<15)
+#define MISC_REGISTERS_RESET_REG_2_RST_GRC			 (0x1<<4)
+#define MISC_REGISTERS_RESET_REG_2_RST_MCP_N_HARD_CORE_RST_B	 (0x1<<6)
+#define MISC_REGISTERS_RESET_REG_2_RST_MCP_N_RESET_REG_HARD_CORE (0x1<<5)
+#define MISC_REGISTERS_RESET_REG_2_RST_MDIO			 (0x1<<13)
+#define MISC_REGISTERS_RESET_REG_2_RST_MISC_CORE		 (0x1<<11)
+#define MISC_REGISTERS_RESET_REG_2_RST_RBCN			 (0x1<<9)
 #define MISC_REGISTERS_RESET_REG_2_SET				 0x594
 #define MISC_REGISTERS_RESET_REG_3_CLEAR			 0x5a8
 #define MISC_REGISTERS_RESET_REG_3_MISC_NIG_MUX_SERDES0_IDDQ	 (0x1<<1)
@@ -4454,6 +4474,7 @@
 #define HW_LOCK_RESOURCE_GPIO					 1
 #define HW_LOCK_RESOURCE_MDIO					 0
 #define HW_LOCK_RESOURCE_PORT0_ATT_MASK 			 3
+#define HW_LOCK_RESOURCE_RESERVED_08				 8
 #define HW_LOCK_RESOURCE_SPIO					 2
 #define HW_LOCK_RESOURCE_UNDI					 5
 #define PRS_FLAG_OVERETH_IPV4					 1
@@ -4474,6 +4495,10 @@
 #define AEU_INPUTS_ATTN_BITS_GPIO3_FUNCTION_0		      (1<<5)
 #define AEU_INPUTS_ATTN_BITS_GPIO3_FUNCTION_1		      (1<<9)
 #define AEU_INPUTS_ATTN_BITS_IGU_PARITY_ERROR		      (1<<12)
+#define AEU_INPUTS_ATTN_BITS_MCP_LATCHED_ROM_PARITY	      (1<<28)
+#define AEU_INPUTS_ATTN_BITS_MCP_LATCHED_SCPAD_PARITY	      (1<<31)
+#define AEU_INPUTS_ATTN_BITS_MCP_LATCHED_UMP_RX_PARITY	      (1<<29)
+#define AEU_INPUTS_ATTN_BITS_MCP_LATCHED_UMP_TX_PARITY	      (1<<30)
 #define AEU_INPUTS_ATTN_BITS_MISC_HW_INTERRUPT		      (1<<15)
 #define AEU_INPUTS_ATTN_BITS_MISC_PARITY_ERROR		      (1<<14)
 #define AEU_INPUTS_ATTN_BITS_PARSER_PARITY_ERROR	      (1<<20)
-- 
1.7.0.4






* [PATCH net-next-2.6 2/11] bnx2x: Use VPD-R V0 entry to display firmware revision
From: Vladislav Zolotarov @ 2010-04-19 11:13 UTC (permalink / raw)
  To: Dave Miller; +Cc: Eilon Greenstein, netdev list, dmitry

Author: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Vladislav Zolotarov <vladz@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
---
 drivers/net/bnx2x.h      |    4 ++
 drivers/net/bnx2x_main.c |   71 ++++++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 73 insertions(+), 2 deletions(-)

diff --git a/drivers/net/bnx2x.h b/drivers/net/bnx2x.h
index ae9c89e..c2bef7a 100644
--- a/drivers/net/bnx2x.h
+++ b/drivers/net/bnx2x.h
@@ -1075,6 +1075,7 @@ struct bnx2x {
 #define INIT_CSEM_INT_TABLE_DATA(bp)	(bp->csem_int_table_data)
 #define INIT_CSEM_PRAM_DATA(bp)		(bp->csem_pram_data)
 
+	char			fw_ver[32];
 	const struct firmware	*firmware;
 };
 
@@ -1333,6 +1334,9 @@ static inline u32 reg_poll(struct bnx2x *bp, u32 reg, u32 expected, int ms,
 #define PXP2_REG_PXP2_INT_STS		PXP2_REG_PXP2_INT_STS_0
 #endif
 
+#define BNX2X_VPD_LEN			128
+#define VENDOR_ID_LEN			4
+
 /* MISC_REG_RESET_REG - this is here for the hsi to work don't touch */
 
 #endif /* bnx2x.h */
diff --git a/drivers/net/bnx2x_main.c b/drivers/net/bnx2x_main.c
index 63a17d6..278e128 100644
--- a/drivers/net/bnx2x_main.c
+++ b/drivers/net/bnx2x_main.c
@@ -8896,6 +8896,70 @@ static int __devinit bnx2x_get_hwinfo(struct bnx2x *bp)
 	return rc;
 }
 
+static void __devinit bnx2x_read_fwinfo(struct bnx2x *bp)
+{
+	int cnt, i, block_end, rodi;
+	char vpd_data[BNX2X_VPD_LEN+1];
+	char str_id_reg[VENDOR_ID_LEN+1];
+	char str_id_cap[VENDOR_ID_LEN+1];
+	u8 len;
+
+	cnt = pci_read_vpd(bp->pdev, 0, BNX2X_VPD_LEN, vpd_data);
+	memset(bp->fw_ver, 0, sizeof(bp->fw_ver));
+
+	if (cnt < BNX2X_VPD_LEN)
+		goto out_not_found;
+
+	i = pci_vpd_find_tag(vpd_data, 0, BNX2X_VPD_LEN,
+			     PCI_VPD_LRDT_RO_DATA);
+	if (i < 0)
+		goto out_not_found;
+
+
+	block_end = i + PCI_VPD_LRDT_TAG_SIZE +
+		    pci_vpd_lrdt_size(&vpd_data[i]);
+
+	i += PCI_VPD_LRDT_TAG_SIZE;
+
+	if (block_end > BNX2X_VPD_LEN)
+		goto out_not_found;
+
+	rodi = pci_vpd_find_info_keyword(vpd_data, i, block_end,
+				   PCI_VPD_RO_KEYWORD_MFR_ID);
+	if (rodi < 0)
+		goto out_not_found;
+
+	len = pci_vpd_info_field_size(&vpd_data[rodi]);
+
+	if (len != VENDOR_ID_LEN)
+		goto out_not_found;
+
+	rodi += PCI_VPD_INFO_FLD_HDR_SIZE;
+
+	/* vendor specific info */
+	snprintf(str_id_reg, VENDOR_ID_LEN + 1, "%04x", PCI_VENDOR_ID_DELL);
+	snprintf(str_id_cap, VENDOR_ID_LEN + 1, "%04X", PCI_VENDOR_ID_DELL);
+	if (!strncmp(str_id_reg, &vpd_data[rodi], VENDOR_ID_LEN) ||
+	    !strncmp(str_id_cap, &vpd_data[rodi], VENDOR_ID_LEN)) {
+
+		rodi = pci_vpd_find_info_keyword(vpd_data, i, block_end,
+						PCI_VPD_RO_KEYWORD_VENDOR0);
+		if (rodi >= 0) {
+			len = pci_vpd_info_field_size(&vpd_data[rodi]);
+
+			rodi += PCI_VPD_INFO_FLD_HDR_SIZE;
+
+			if (len < 32 && (len + rodi) <= BNX2X_VPD_LEN) {
+				memcpy(bp->fw_ver, &vpd_data[rodi], len);
+				bp->fw_ver[len] = ' ';
+			}
+		}
+		return;
+	}
+out_not_found:
+	return;
+}
+
 static int __devinit bnx2x_init_bp(struct bnx2x *bp)
 {
 	int func = BP_FUNC(bp);
@@ -8917,6 +8981,7 @@ static int __devinit bnx2x_init_bp(struct bnx2x *bp)
 
 	rc = bnx2x_get_hwinfo(bp);
 
+	bnx2x_read_fwinfo(bp);
 	/* need to reset chip if undi was active */
 	if (!BP_NOMCP(bp))
 		bnx2x_undi_unload(bp);
@@ -9307,11 +9372,13 @@ static void bnx2x_get_drvinfo(struct net_device *dev,
 		bnx2x_release_phy_lock(bp);
 	}
 
-	snprintf(info->fw_version, 32, "BC:%d.%d.%d%s%s",
+	strncpy(info->fw_version, bp->fw_ver, 32);
+	snprintf(info->fw_version + strlen(bp->fw_ver), 32 - strlen(bp->fw_ver),
+		 "bc %d.%d.%d%s%s",
 		 (bp->common.bc_ver & 0xff0000) >> 16,
 		 (bp->common.bc_ver & 0xff00) >> 8,
 		 (bp->common.bc_ver & 0xff),
-		 ((phy_fw_ver[0] != '\0') ? " PHY:" : ""), phy_fw_ver);
+		 ((phy_fw_ver[0] != '\0') ? " phy " : ""), phy_fw_ver);
 	strcpy(info->bus_info, pci_name(bp->pdev));
 	info->n_stats = BNX2X_NUM_STATS;
 	info->testinfo_len = BNX2X_NUM_TESTS;
-- 
1.6.3.3
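To illustrate how the new fw_version string is put together (VPD prefix first, then the bootcode and PHY versions), here is a small user-space sketch. build_fw_version() and FW_VERSION_LEN are hypothetical names, and the sketch copies the prefix with snprintf() rather than strncpy() so the buffer is always terminated; that is an assumption of the sketch, not what the patch does.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define FW_VERSION_LEN 32	/* assumed size of the fw_version field */

/* Compose "<vpd prefix>bc X.Y.Z[ phy <ver>]" into out, truncating
 * if the result does not fit. vpd_ver is expected to end with a
 * space when it is non-empty, as bp->fw_ver does in the patch.
 */
static void build_fw_version(char out[FW_VERSION_LEN], const char *vpd_ver,
			     unsigned int bc_ver, const char *phy_ver)
{
	size_t used;

	/* prefix from the VPD-R V0 entry (may be empty) */
	snprintf(out, FW_VERSION_LEN, "%s", vpd_ver);
	used = strlen(out);

	/* append bootcode and optional PHY firmware version */
	snprintf(out + used, FW_VERSION_LEN - used, "bc %u.%u.%u%s%s",
		 (bc_ver & 0xff0000) >> 16,
		 (bc_ver & 0xff00) >> 8,
		 (bc_ver & 0xff),
		 (phy_ver[0] != '\0') ? " phy " : "", phy_ver);
}
```

Since used is at most FW_VERSION_LEN - 1, the second snprintf() always has room for at least the terminator.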






* [PATCH net-next-2.6 3/11] bnx2x: Increase DMAE max write size for 57711
From: Vladislav Zolotarov @ 2010-04-19 11:13 UTC (permalink / raw)
  To: Dave Miller; +Cc: Eilon Greenstein, netdev list

Increase DMAE max write size for 57711 to the maximum allowed value.

Signed-off-by: Vladislav Zolotarov <vladz@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
---
 drivers/net/bnx2x.h      |    2 +-
 drivers/net/bnx2x_main.c |    9 +++++----
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/net/bnx2x.h b/drivers/net/bnx2x.h
index 0706c2c..fc00f79 100644
--- a/drivers/net/bnx2x.h
+++ b/drivers/net/bnx2x.h
@@ -1165,7 +1165,7 @@ static inline u32 reg_poll(struct bnx2x *bp, u32 reg, u32 expected, int ms,
 #define DMAE_CMD_E1HVN_SHIFT		DMAE_COMMAND_E1HVN_SHIFT
 
 #define DMAE_LEN32_RD_MAX		0x80
-#define DMAE_LEN32_WR_MAX		0x400
+#define DMAE_LEN32_WR_MAX(bp)		(CHIP_IS_E1(bp) ? 0x400 : 0x2000)
 
 #define DMAE_COMP_VAL			0xe0d0d0ae
 
diff --git a/drivers/net/bnx2x_main.c b/drivers/net/bnx2x_main.c
index 33d7484..da89cb0 100644
--- a/drivers/net/bnx2x_main.c
+++ b/drivers/net/bnx2x_main.c
@@ -352,13 +352,14 @@ void bnx2x_read_dmae(struct bnx2x *bp, u32 src_addr, u32 len32)
 void bnx2x_write_dmae_phys_len(struct bnx2x *bp, dma_addr_t phys_addr,
 			       u32 addr, u32 len)
 {
+	int dmae_wr_max = DMAE_LEN32_WR_MAX(bp);
 	int offset = 0;
 
-	while (len > DMAE_LEN32_WR_MAX) {
+	while (len > dmae_wr_max) {
 		bnx2x_write_dmae(bp, phys_addr + offset,
-				 addr + offset, DMAE_LEN32_WR_MAX);
-		offset += DMAE_LEN32_WR_MAX * 4;
-		len -= DMAE_LEN32_WR_MAX;
+				 addr + offset, dmae_wr_max);
+		offset += dmae_wr_max * 4;
+		len -= dmae_wr_max;
 	}
 
 	bnx2x_write_dmae(bp, phys_addr + offset, addr + offset, len);
-- 
1.6.3.3
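As a rough illustration of what the larger limit buys, the chunking loop in bnx2x_write_dmae_phys_len() can be modelled in user space as below. split_write() and struct chunk_stats are hypothetical names; only the two limits (0x400 and 0x2000 dwords) come from the patch. Lengths are in 32-bit words and offsets in bytes, hence the "* 4".

```c
#include <assert.h>
#include <stdint.h>

#define WR_MAX_E1	0x400u	/* dwords per write on E1 (from the patch) */
#define WR_MAX_E1H	0x2000u	/* dwords per write on 57711 (from the patch) */

struct chunk_stats {
	uint32_t writes;	/* number of write commands issued */
	uint32_t last_offset;	/* byte offset of the final write */
	uint32_t last_len;	/* dword length of the final write */
};

/* Split a transfer of len32 dwords into writes of at most wr_max dwords. */
static struct chunk_stats split_write(uint32_t len32, uint32_t wr_max)
{
	struct chunk_stats st = { 0, 0, 0 };
	uint32_t offset = 0;

	assert(wr_max > 0);

	while (len32 > wr_max) {
		/* the driver would call bnx2x_write_dmae() here */
		st.writes++;
		offset += wr_max * 4;
		len32 -= wr_max;
	}

	/* final write carries whatever is left */
	st.writes++;
	st.last_offset = offset;
	st.last_len = len32;
	return st;
}
```

For a 0x2400-dword transfer this model issues 9 writes with the E1 limit and 2 with the 57711 limit.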






* [PATCH net-next-2.6 4/11] bnx2x: Protect code with NOMCP
From: Vladislav Zolotarov @ 2010-04-19 11:13 UTC (permalink / raw)
  To: Dave Miller; +Cc: Eilon Greenstein, netdev list

Skip code paths that need the MCP when it is not present.
This prevents NULL pointer dereferences.

Signed-off-by: Vladislav Zolotarov <vladz@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
---
 drivers/net/bnx2x_main.c |   44 ++++++++++++++++++++++++++++++++------------
 1 files changed, 32 insertions(+), 12 deletions(-)

diff --git a/drivers/net/bnx2x_main.c b/drivers/net/bnx2x_main.c
index 298d3e5..4861ee3 100644
--- a/drivers/net/bnx2x_main.c
+++ b/drivers/net/bnx2x_main.c
@@ -513,6 +513,10 @@ static void bnx2x_fw_dump(struct bnx2x *bp)
 	__be32 data[9];
 	int word;
 
+	if (BP_NOMCP(bp)) {
+		BNX2X_ERR("NO MCP - can not dump\n");
+		return;
+	}
 	mark = REG_RD(bp, MCP_REG_MCPR_SCRATCH + 0xf104);
 	mark = ((mark + 0x3) & ~0x3);
 	pr_err("begin fw dump (mark 0x%x)\n", mark);
@@ -2279,11 +2283,14 @@ static void bnx2x__link_reset(struct bnx2x *bp)
 
 static u8 bnx2x_link_test(struct bnx2x *bp)
 {
-	u8 rc;
+	u8 rc = 0;
 
-	bnx2x_acquire_phy_lock(bp);
-	rc = bnx2x_test_link(&bp->link_params, &bp->link_vars);
-	bnx2x_release_phy_lock(bp);
+	if (!BP_NOMCP(bp)) {
+		bnx2x_acquire_phy_lock(bp);
+		rc = bnx2x_test_link(&bp->link_params, &bp->link_vars);
+		bnx2x_release_phy_lock(bp);
+	} else
+		BNX2X_ERR("Bootcode is missing - can not test link\n");
 
 	return rc;
 }
@@ -4275,7 +4282,6 @@ static int bnx2x_hw_stats_update(struct bnx2x *bp)
 		u32 lo;
 		u32 hi;
 	} diff;
-	u32 nig_timer_max;
 
 	if (bp->link_vars.mac_type == MAC_TYPE_BMAC)
 		bnx2x_bmac_stats_update(bp);
@@ -4306,10 +4312,14 @@ static int bnx2x_hw_stats_update(struct bnx2x *bp)
 
 	pstats->host_port_stats_start = ++pstats->host_port_stats_end;
 
-	nig_timer_max = SHMEM_RD(bp, port_mb[BP_PORT(bp)].stat_nig_timer);
-	if (nig_timer_max != estats->nig_timer_max) {
-		estats->nig_timer_max = nig_timer_max;
-		BNX2X_ERR("NIG timer max (%u)\n", estats->nig_timer_max);
+	if (!BP_NOMCP(bp)) {
+		u32 nig_timer_max =
+			SHMEM_RD(bp, port_mb[BP_PORT(bp)].stat_nig_timer);
+		if (nig_timer_max != estats->nig_timer_max) {
+			estats->nig_timer_max = nig_timer_max;
+			BNX2X_ERR("NIG timer max (%u)\n",
+				  estats->nig_timer_max);
+		}
 	}
 
 	return 0;
@@ -6376,10 +6386,14 @@ static void bnx2x_init_pxp(struct bnx2x *bp)
 
 static void bnx2x_setup_fan_failure_detection(struct bnx2x *bp)
 {
+	int is_required;
 	u32 val;
-	u8 port;
-	u8 is_required = 0;
+	int port;
 
+	if (BP_NOMCP(bp))
+		return;
+
+	is_required = 0;
 	val = SHMEM_RD(bp, dev_info.shared_hw_config.config2) &
 	      SHARED_HW_CFG_FAN_FAILURE_MASK;
 
@@ -9687,7 +9701,7 @@ static int __devinit bnx2x_get_hwinfo(struct bnx2x *bp)
 
 	bp->e1hov = 0;
 	bp->e1hmf = 0;
-	if (CHIP_IS_E1H(bp)) {
+	if (CHIP_IS_E1H(bp) && !BP_NOMCP(bp)) {
 		bp->mf_config =
 			SHMEM_RD(bp, mf_cfg.func_mf_config[func].config);
 
@@ -11362,6 +11376,9 @@ static int bnx2x_test_loopback(struct bnx2x *bp, u8 link_up)
 {
 	int rc = 0, res;
 
+	if (BP_NOMCP(bp))
+		return rc;
+
 	if (!netif_running(bp->dev))
 		return BNX2X_LOOPBACK_FAILED;
 
@@ -11409,6 +11426,9 @@ static int bnx2x_test_nvram(struct bnx2x *bp)
 	int i, rc;
 	u32 magic, crc;
 
+	if (BP_NOMCP(bp))
+		return 0;
+
 	rc = bnx2x_nvram_read(bp, 0, data, 4);
 	if (rc) {
 		DP(NETIF_MSG_PROBE, "magic value read (rc %d)\n", rc);
-- 
1.6.3.3
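The guard pattern used throughout this patch can be sketched outside the driver as follows. struct dev, dev_nomcp() and update_nig_timer() are hypothetical stand-ins (the flag value is made up as well); the point is only that a reader of MCP-owned shared memory returns early when no MCP is present.

```c
#include <assert.h>
#include <stdint.h>

#define NO_MCP_FLAG 0x1u	/* hypothetical flag value */

/* Hypothetical stand-in for the relevant parts of struct bnx2x. */
struct dev {
	uint32_t flags;
	uint32_t shmem_nig_timer;	/* pretend MCP-owned shared memory */
	uint32_t nig_timer_max;		/* driver's cached copy */
};

static int dev_nomcp(const struct dev *d)
{
	return (d->flags & NO_MCP_FLAG) != 0;
}

/* Returns 1 if the cached value was refreshed, 0 otherwise. */
static int update_nig_timer(struct dev *d)
{
	/* without an MCP the shared memory is not valid: do nothing */
	if (dev_nomcp(d))
		return 0;

	if (d->shmem_nig_timer != d->nig_timer_max) {
		d->nig_timer_max = d->shmem_nig_timer;
		return 1;
	}
	return 0;
}
```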






* [PATCH net-next-2.6 5/11] bnx2x: White spaces
From: Vladislav Zolotarov @ 2010-04-19 11:13 UTC (permalink / raw)
  To: Dave Miller; +Cc: Eilon Greenstein, netdev list

Whitespace cleanups, code readability and print format changes.

Signed-off-by: Vladislav Zolotarov <vladz@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
---
 drivers/net/bnx2x.h      |    9 +-
 drivers/net/bnx2x_main.c |  366 +++++++++++++++++++++++++-------------------
 2 files changed, 214 insertions(+), 162 deletions(-)

diff --git a/drivers/net/bnx2x.h b/drivers/net/bnx2x.h
index fc00f79..236235f 100644
--- a/drivers/net/bnx2x.h
+++ b/drivers/net/bnx2x.h
@@ -83,7 +83,12 @@ do {								\
 	       __func__, __LINE__,				\
 	       bp->dev ? (bp->dev->name) : "?",			\
 	       ##__args);					\
-} while (0)
+	} while (0)
+
+#define BNX2X_ERROR(__fmt, __args...) do { \
+	pr_err("[%s:%d]" __fmt, __func__, __LINE__, ##__args); \
+	} while (0)
+
 
 /* before we have a dev->name use dev_info() */
 #define BNX2X_DEV_INFO(__fmt, __args...)			 \
@@ -972,6 +977,8 @@ struct bnx2x {
 	u16			rx_quick_cons_trip;
 	u16			rx_ticks_int;
 	u16			rx_ticks;
+/* Maximal coalescing timeout in us */
+#define BNX2X_MAX_COALESCE_TOUT		(0xf0*12)
 
 	u32			lin_cnt;
 
diff --git a/drivers/net/bnx2x_main.c b/drivers/net/bnx2x_main.c
index 4861ee3..4bf8013 100644
--- a/drivers/net/bnx2x_main.c
+++ b/drivers/net/bnx2x_main.c
@@ -102,7 +102,8 @@ MODULE_PARM_DESC(disable_tpa, " Disable the TPA (LRO) feature");
 
 static int int_mode;
 module_param(int_mode, int, 0);
-MODULE_PARM_DESC(int_mode, " Force interrupt mode (1 INT#x; 2 MSI)");
+MODULE_PARM_DESC(int_mode, " Force interrupt mode other than MSI-X "
+				"(1 INT#x; 2 MSI)");
 
 static int dropless_fc;
 module_param(dropless_fc, int, 0);
@@ -509,6 +510,7 @@ static int bnx2x_mc_assert(struct bnx2x *bp)
 
 static void bnx2x_fw_dump(struct bnx2x *bp)
 {
+	u32 addr;
 	u32 mark, offset;
 	__be32 data[9];
 	int word;
@@ -517,22 +519,22 @@ static void bnx2x_fw_dump(struct bnx2x *bp)
 		BNX2X_ERR("NO MCP - can not dump\n");
 		return;
 	}
-	mark = REG_RD(bp, MCP_REG_MCPR_SCRATCH + 0xf104);
-	mark = ((mark + 0x3) & ~0x3);
+
+	addr = bp->common.shmem_base - 0x0800 + 4;
+	mark = REG_RD(bp, addr);
+	mark = MCP_REG_MCPR_SCRATCH + ((mark + 0x3) & ~0x3) - 0x08000000;
 	pr_err("begin fw dump (mark 0x%x)\n", mark);
 
 	pr_err("");
-	for (offset = mark - 0x08000000; offset <= 0xF900; offset += 0x8*4) {
+	for (offset = mark; offset <= bp->common.shmem_base; offset += 0x8*4) {
 		for (word = 0; word < 8; word++)
-			data[word] = htonl(REG_RD(bp, MCP_REG_MCPR_SCRATCH +
-						  offset + 4*word));
+			data[word] = htonl(REG_RD(bp, offset + 4*word));
 		data[8] = 0x0;
 		pr_cont("%s", (char *)data);
 	}
-	for (offset = 0xF108; offset <= mark - 0x08000000; offset += 0x8*4) {
+	for (offset = addr + 4; offset <= mark; offset += 0x8*4) {
 		for (word = 0; word < 8; word++)
-			data[word] = htonl(REG_RD(bp, MCP_REG_MCPR_SCRATCH +
-						  offset + 4*word));
+			data[word] = htonl(REG_RD(bp, offset + 4*word));
 		data[8] = 0x0;
 		pr_cont("%s", (char *)data);
 	}
@@ -551,9 +553,9 @@ static void bnx2x_panic_dump(struct bnx2x *bp)
 
 	/* Indices */
 	/* Common */
-	BNX2X_ERR("def_c_idx(%u)  def_u_idx(%u)  def_x_idx(%u)"
-		  "  def_t_idx(%u)  def_att_idx(%u)  attn_state(%u)"
-		  "  spq_prod_idx(%u)\n",
+	BNX2X_ERR("def_c_idx(0x%x)  def_u_idx(0x%x)  def_x_idx(0x%x)"
+		  "  def_t_idx(0x%x)  def_att_idx(0x%x)  attn_state(0x%x)"
+		  "  spq_prod_idx(0x%x)\n",
 		  bp->def_c_idx, bp->def_u_idx, bp->def_x_idx, bp->def_t_idx,
 		  bp->def_att_idx, bp->attn_state, bp->spq_prod_idx);
 
@@ -561,14 +563,14 @@ static void bnx2x_panic_dump(struct bnx2x *bp)
 	for_each_queue(bp, i) {
 		struct bnx2x_fastpath *fp = &bp->fp[i];
 
-		BNX2X_ERR("fp%d: rx_bd_prod(%x)  rx_bd_cons(%x)"
-			  "  *rx_bd_cons_sb(%x)  rx_comp_prod(%x)"
-			  "  rx_comp_cons(%x)  *rx_cons_sb(%x)\n",
+		BNX2X_ERR("fp%d: rx_bd_prod(0x%x)  rx_bd_cons(0x%x)"
+			  "  *rx_bd_cons_sb(0x%x)  rx_comp_prod(0x%x)"
+			  "  rx_comp_cons(0x%x)  *rx_cons_sb(0x%x)\n",
 			  i, fp->rx_bd_prod, fp->rx_bd_cons,
 			  le16_to_cpu(*fp->rx_bd_cons_sb), fp->rx_comp_prod,
 			  fp->rx_comp_cons, le16_to_cpu(*fp->rx_cons_sb));
-		BNX2X_ERR("      rx_sge_prod(%x)  last_max_sge(%x)"
-			  "  fp_u_idx(%x) *sb_u_idx(%x)\n",
+		BNX2X_ERR("     rx_sge_prod(0x%x)  last_max_sge(0x%x)"
+			  "  fp_u_idx(0x%x) *sb_u_idx(0x%x)\n",
 			  fp->rx_sge_prod, fp->last_max_sge,
 			  le16_to_cpu(fp->fp_u_idx),
 			  fp->status_blk->u_status_block.status_block_index);
@@ -578,12 +580,13 @@ static void bnx2x_panic_dump(struct bnx2x *bp)
 	for_each_queue(bp, i) {
 		struct bnx2x_fastpath *fp = &bp->fp[i];
 
-		BNX2X_ERR("fp%d: tx_pkt_prod(%x)  tx_pkt_cons(%x)"
-			  "  tx_bd_prod(%x)  tx_bd_cons(%x)  *tx_cons_sb(%x)\n",
+		BNX2X_ERR("fp%d: tx_pkt_prod(0x%x)  tx_pkt_cons(0x%x)"
+			  "  tx_bd_prod(0x%x)  tx_bd_cons(0x%x)"
+			  "  *tx_cons_sb(0x%x)\n",
 			  i, fp->tx_pkt_prod, fp->tx_pkt_cons, fp->tx_bd_prod,
 			  fp->tx_bd_cons, le16_to_cpu(*fp->tx_cons_sb));
-		BNX2X_ERR("      fp_c_idx(%x)  *sb_c_idx(%x)"
-			  "  tx_db_prod(%x)\n", le16_to_cpu(fp->fp_c_idx),
+		BNX2X_ERR("     fp_c_idx(0x%x)  *sb_c_idx(0x%x)"
+			  "  tx_db_prod(0x%x)\n", le16_to_cpu(fp->fp_c_idx),
 			  fp->status_blk->c_status_block.status_block_index,
 			  fp->tx_db.data.prod);
 	}
@@ -1050,7 +1053,8 @@ static void bnx2x_sp_event(struct bnx2x_fastpath *fp,
 
 		default:
 			BNX2X_ERR("unexpected MC reply (%d)  "
-				  "fp->state is %x\n", command, fp->state);
+				  "fp[%d] state is %x\n",
+				  command, fp->index, fp->state);
 			break;
 		}
 		mb(); /* force bnx2x_wait_ramrod() to see the change */
@@ -1329,7 +1333,7 @@ static void bnx2x_tpa_start(struct bnx2x_fastpath *fp, u16 queue,
 
 #ifdef BNX2X_STOP_ON_ERROR
 	fp->tpa_queue_used |= (1 << queue);
-#ifdef __powerpc64__
+#ifdef _ASM_GENERIC_INT_L64_H
 	DP(NETIF_MSG_RX_STATUS, "fp->tpa_queue_used = 0x%lx\n",
 #else
 	DP(NETIF_MSG_RX_STATUS, "fp->tpa_queue_used = 0x%llx\n",
@@ -1358,8 +1362,7 @@ static int bnx2x_fill_frag_skb(struct bnx2x *bp, struct bnx2x_fastpath *fp,
 					       max(frag_size, (u32)len_on_bd));
 
 #ifdef BNX2X_STOP_ON_ERROR
-	if (pages >
-	    min((u32)8, (u32)MAX_SKB_FRAGS) * SGE_PAGE_SIZE * PAGES_PER_SGE) {
+	if (pages > min_t(u32, 8, MAX_SKB_FRAGS)*SGE_PAGE_SIZE*PAGES_PER_SGE) {
 		BNX2X_ERR("SGL length is too long: %d. CQE index is %d\n",
 			  pages, cqe_idx);
 		BNX2X_ERR("fp_cqe->pkt_len = %d  fp_cqe->len_on_bd = %d\n",
@@ -1858,8 +1861,8 @@ static irqreturn_t bnx2x_interrupt(int irq, void *dev_instance)
 			return IRQ_HANDLED;
 	}
 
-	if (status)
-		DP(NETIF_MSG_INTR, "got an unknown interrupt! (status %u)\n",
+	if (unlikely(status))
+		DP(NETIF_MSG_INTR, "got an unknown interrupt! (status 0x%x)\n",
 		   status);
 
 	return IRQ_HANDLED;
@@ -2419,10 +2422,10 @@ static void bnx2x_init_vn_minmax(struct bnx2x *bp, int func)
 		   T_FAIR_COEF / (8 * vn_weight_sum) will always be greater
 		   than zero */
 		m_fair_vn.vn_credit_delta =
-			max((u32)(vn_min_rate * (T_FAIR_COEF /
-						 (8 * bp->vn_weight_sum))),
-			    (u32)(bp->cmng.fair_vars.fair_threshold * 2));
-		DP(NETIF_MSG_IFUP, "m_fair_vn.vn_credit_delta=%d\n",
+			max_t(u32, (vn_min_rate * (T_FAIR_COEF /
+						   (8 * bp->vn_weight_sum))),
+			      (bp->cmng.fair_vars.fair_threshold * 2));
+		DP(NETIF_MSG_IFUP, "m_fair_vn.vn_credit_delta %d\n",
 		   m_fair_vn.vn_credit_delta);
 	}
 
@@ -2592,7 +2595,6 @@ u32 bnx2x_fw_command(struct bnx2x *bp, u32 command)
 	return rc;
 }
 
-static void bnx2x_set_storm_rx_mode(struct bnx2x *bp);
 static void bnx2x_set_eth_mac_addr_e1h(struct bnx2x *bp, int set);
 static void bnx2x_set_rx_mode(struct net_device *dev);
 
@@ -2728,12 +2730,6 @@ static int bnx2x_sp_post(struct bnx2x *bp, int command, int cid,
 {
 	struct eth_spe *spe;
 
-	DP(BNX2X_MSG_SP/*NETIF_MSG_TIMER*/,
-	   "SPQE (%x:%x)  command %d  hw_cid %x  data (%x:%x)  left %x\n",
-	   (u32)U64_HI(bp->spq_mapping), (u32)(U64_LO(bp->spq_mapping) +
-	   (void *)bp->spq_prod_bd - (void *)bp->spq), command,
-	   HW_CID(bp, cid), data_hi, data_lo, bp->spq_left);
-
 #ifdef BNX2X_STOP_ON_ERROR
 	if (unlikely(bp->panic))
 		return -EIO;
@@ -2752,8 +2748,8 @@ static int bnx2x_sp_post(struct bnx2x *bp, int command, int cid,
 
 	/* CID needs port number to be encoded int it */
 	spe->hdr.conn_and_cmd_data =
-			cpu_to_le32(((command << SPE_HDR_CMD_ID_SHIFT) |
-				     HW_CID(bp, cid)));
+			cpu_to_le32((command << SPE_HDR_CMD_ID_SHIFT) |
+				    HW_CID(bp, cid));
 	spe->hdr.type = cpu_to_le16(ETH_CONNECTION_TYPE);
 	if (common)
 		spe->hdr.type |=
@@ -2764,6 +2760,13 @@ static int bnx2x_sp_post(struct bnx2x *bp, int command, int cid,
 
 	bp->spq_left--;
 
+	DP(BNX2X_MSG_SP/*NETIF_MSG_TIMER*/,
+	   "SPQE[%x] (%x:%x)  command %d  hw_cid %x  data (%x:%x)  left %x\n",
+	   bp->spq_prod_idx, (u32)U64_HI(bp->spq_mapping),
+	   (u32)(U64_LO(bp->spq_mapping) +
+	   (void *)bp->spq_prod_bd - (void *)bp->spq), command,
+	   HW_CID(bp, cid), data_hi, data_lo, bp->spq_left);
+
 	bnx2x_sp_prod_update(bp);
 	spin_unlock_bh(&bp->spq_lock);
 	return 0;
@@ -2939,8 +2942,9 @@ static inline void bnx2x_fan_failure(struct bnx2x *bp)
 		 bp->link_params.ext_phy_config);
 
 	/* log the failure */
-	netdev_err(bp->dev, "Fan Failure on Network Controller has caused the driver to shutdown the card to prevent permanent damage.\n"
-		   "Please contact Dell Support for assistance.\n");
+	netdev_err(bp->dev, "Fan Failure on Network Controller has caused"
+	       " the driver to shutdown the card to prevent permanent"
+	       " damage.  Please contact OEM Support for assistance\n");
 }
 
 static inline void bnx2x_attn_int_deasserted0(struct bnx2x *bp, u32 attn)
@@ -3562,11 +3566,23 @@ static void bnx2x_sp_task(struct work_struct *work)
 /*	if (status == 0)				     */
 /*		BNX2X_ERR("spurious slowpath interrupt!\n"); */
 
-	DP(NETIF_MSG_INTR, "got a slowpath interrupt (updated %x)\n", status);
+	DP(NETIF_MSG_INTR, "got a slowpath interrupt (status 0x%x)\n", status);
 
 	/* HW attentions */
-	if (status & 0x1)
+	if (status & 0x1) {
 		bnx2x_attn_int(bp);
+		status &= ~0x1;
+	}
+
+	/* CStorm events: STAT_QUERY */
+	if (status & 0x2) {
+		DP(BNX2X_MSG_SP, "CStorm events: STAT_QUERY\n");
+		status &= ~0x2;
+	}
+
+	if (unlikely(status))
+		DP(NETIF_MSG_INTR, "got an unknown interrupt! (status 0x%x)\n",
+		   status);
 
 	bnx2x_ack_sb(bp, DEF_SB_ID, ATTENTION_ID, le16_to_cpu(bp->def_att_idx),
 		     IGU_INT_NOP, 1);
@@ -3578,7 +3594,6 @@ static void bnx2x_sp_task(struct work_struct *work)
 		     IGU_INT_NOP, 1);
 	bnx2x_ack_sb(bp, DEF_SB_ID, TSTORM_ID, le16_to_cpu(bp->def_t_idx),
 		     IGU_INT_ENABLE, 1);
-
 }
 
 static irqreturn_t bnx2x_msix_sp_int(int irq, void *dev_instance)
@@ -4363,21 +4378,21 @@ static int bnx2x_storm_stats_update(struct bnx2x *bp)
 		if ((u16)(le16_to_cpu(xclient->stats_counter) + 1) !=
 							bp->stats_counter) {
 			DP(BNX2X_MSG_STATS, "[%d] stats not updated by xstorm"
-			   "  xstorm counter (%d) != stats_counter (%d)\n",
+			   "  xstorm counter (0x%x) != stats_counter (0x%x)\n",
 			   i, xclient->stats_counter, bp->stats_counter);
 			return -1;
 		}
 		if ((u16)(le16_to_cpu(tclient->stats_counter) + 1) !=
 							bp->stats_counter) {
 			DP(BNX2X_MSG_STATS, "[%d] stats not updated by tstorm"
-			   "  tstorm counter (%d) != stats_counter (%d)\n",
+			   "  tstorm counter (0x%x) != stats_counter (0x%x)\n",
 			   i, tclient->stats_counter, bp->stats_counter);
 			return -2;
 		}
 		if ((u16)(le16_to_cpu(uclient->stats_counter) + 1) !=
 							bp->stats_counter) {
 			DP(BNX2X_MSG_STATS, "[%d] stats not updated by ustorm"
-			   "  ustorm counter (%d) != stats_counter (%d)\n",
+			   "  ustorm counter (0x%x) != stats_counter (0x%x)\n",
 			   i, uclient->stats_counter, bp->stats_counter);
 			return -4;
 		}
@@ -4806,6 +4821,9 @@ static void bnx2x_stats_handle(struct bnx2x *bp, enum bnx2x_stats_event event)
 {
 	enum bnx2x_stats_state state = bp->stats_state;
 
+	if (unlikely(bp->panic))
+		return;
+
 	bnx2x_stats_stm[state][event].action(bp);
 	bp->stats_state = bnx2x_stats_stm[state][event].next_state;
 
@@ -5410,8 +5428,8 @@ static void bnx2x_init_rx_rings(struct bnx2x *bp)
 
 		fp->rx_bd_prod = ring_prod;
 		/* must not have more available CQEs than BDs */
-		fp->rx_comp_prod = min((u16)(NUM_RCQ_RINGS*RCQ_DESC_CNT),
-				       cqe_ring_prod);
+		fp->rx_comp_prod = min_t(u16, NUM_RCQ_RINGS*RCQ_DESC_CNT,
+					 cqe_ring_prod);
 		fp->rx_pkt = fp->rx_calls = 0;
 
 		/* Warning!
@@ -5517,8 +5535,8 @@ static void bnx2x_init_context(struct bnx2x *bp)
 			context->ustorm_st_context.common.flags |=
 				USTORM_ETH_ST_CONTEXT_CONFIG_ENABLE_TPA;
 			context->ustorm_st_context.common.sge_buff_size =
-				(u16)min((u32)SGE_PAGE_SIZE*PAGES_PER_SGE,
-					 (u32)0xffff);
+				(u16)min_t(u32, SGE_PAGE_SIZE*PAGES_PER_SGE,
+					   0xffff);
 			context->ustorm_st_context.common.sge_page_base_hi =
 						U64_HI(fp->rx_sge_mapping);
 			context->ustorm_st_context.common.sge_page_base_lo =
@@ -5815,10 +5833,8 @@ static void bnx2x_init_internal_func(struct bnx2x *bp)
 	}
 
 	/* Init CQ ring mapping and aggregation size, the FW limit is 8 frags */
-	max_agg_size =
-		min((u32)(min((u32)8, (u32)MAX_SKB_FRAGS) *
-			  SGE_PAGE_SIZE * PAGES_PER_SGE),
-		    (u32)0xffff);
+	max_agg_size = min_t(u32, (min_t(u32, 8, MAX_SKB_FRAGS) *
+				   SGE_PAGE_SIZE * PAGES_PER_SGE), 0xffff);
 	for_each_queue(bp, i) {
 		struct bnx2x_fastpath *fp = &bp->fp[i];
 
@@ -5904,7 +5920,7 @@ static void bnx2x_init_internal_func(struct bnx2x *bp)
 	}
 
 
-	/* Store it to internal memory */
+	/* Store cmng structures to internal memory */
 	if (bp->port.pmf)
 		for (i = 0; i < sizeof(struct cmng_struct_per_port) / 4; i++)
 			REG_WR(bp, BAR_XSTRORM_INTMEM +
@@ -6022,7 +6038,8 @@ gunzip_nomem2:
 	bp->gunzip_buf = NULL;
 
 gunzip_nomem1:
-	netdev_err(bp->dev, "Cannot allocate firmware buffer for un-compression\n");
+	netdev_err(bp->dev, "Cannot allocate firmware buffer for"
+	       " un-compression\n");
 	return -ENOMEM;
 }
 
@@ -6073,8 +6090,9 @@ static int bnx2x_gunzip(struct bnx2x *bp, const u8 *zbuf, int len)
 
 	bp->gunzip_outlen = (FW_BUF_SIZE - bp->strm->avail_out);
 	if (bp->gunzip_outlen & 0x3)
-		netdev_err(bp->dev, "Firmware decompression error: gunzip_outlen (%d) not aligned\n",
-			   bp->gunzip_outlen);
+		netdev_err(bp->dev, "Firmware decompression error:"
+				    " gunzip_outlen (%d) not aligned\n",
+				bp->gunzip_outlen);
 	bp->gunzip_outlen >>= 2;
 
 	zlib_inflateEnd(bp->strm);
@@ -6432,7 +6450,7 @@ static void bnx2x_setup_fan_failure_detection(struct bnx2x *bp)
 	/* set to active low mode */
 	val = REG_RD(bp, MISC_REG_SPIO_INT);
 	val |= ((1 << MISC_REGISTERS_SPIO_5) <<
-				MISC_REGISTERS_SPIO_INT_OLD_SET_POS);
+					MISC_REGISTERS_SPIO_INT_OLD_SET_POS);
 	REG_WR(bp, MISC_REG_SPIO_INT, val);
 
 	/* enable interrupt to signal the IGU */
@@ -6619,7 +6637,8 @@ static int bnx2x_init_common(struct bnx2x *bp)
 
 	if (sizeof(union cdu_context) != 1024)
 		/* we currently assume that a context is 1024 bytes */
-		pr_alert("please adjust the size of cdu_context(%ld)\n",
+		dev_alert(&bp->pdev->dev, "please adjust the size "
+					  "of cdu_context(%ld)\n",
 			 (long)sizeof(union cdu_context));
 
 	bnx2x_init_block(bp, CDU_BLOCK, COMMON_STAGE);
@@ -6723,7 +6742,7 @@ static int bnx2x_init_port(struct bnx2x *bp)
 	u32 low, high;
 	u32 val;
 
-	DP(BNX2X_MSG_MCP, "starting port init  port %x\n", port);
+	DP(BNX2X_MSG_MCP, "starting port init  port %d\n", port);
 
 	REG_WR(bp, NIG_REG_MASK_INTERRUPT_PORT0 + port*4, 0);
 
@@ -6742,6 +6761,7 @@ static int bnx2x_init_port(struct bnx2x *bp)
 	REG_WR(bp, TM_REG_LIN0_SCAN_TIME + port*4, 20);
 	REG_WR(bp, TM_REG_LIN0_MAX_ACTIVE_CID + port*4, 31);
 #endif
+
 	bnx2x_init_block(bp, DQ_BLOCK, init_stage);
 
 	bnx2x_init_block(bp, BRB1_BLOCK, init_stage);
@@ -6934,7 +6954,7 @@ static int bnx2x_init_func(struct bnx2x *bp)
 	u32 addr, val;
 	int i;
 
-	DP(BNX2X_MSG_MCP, "starting func init  func %x\n", func);
+	DP(BNX2X_MSG_MCP, "starting func init  func %d\n", func);
 
 	/* set MSI reconfigure capability */
 	addr = (port ? HC_REG_CONFIG_1 : HC_REG_CONFIG_0);
@@ -7428,10 +7448,11 @@ static int bnx2x_req_msix_irqs(struct bnx2x *bp)
 	}
 
 	i = BNX2X_NUM_QUEUES(bp);
-	netdev_info(bp->dev, "using MSI-X  IRQs: sp %d  fp[%d] %d ... fp[%d] %d\n",
-		    bp->msix_table[0].vector,
-		    0, bp->msix_table[offset].vector,
-		    i - 1, bp->msix_table[offset + i - 1].vector);
+	netdev_info(bp->dev, "using MSI-X  IRQs: sp %d  fp[%d] %d"
+	       " ... fp[%d] %d\n",
+	       bp->msix_table[0].vector,
+	       0, bp->msix_table[offset].vector,
+	       i - 1, bp->msix_table[offset + i - 1].vector);
 
 	return 0;
 }
@@ -9142,7 +9163,7 @@ static void __devinit bnx2x_get_common_hwinfo(struct bnx2x *bp)
 	val = SHMEM_RD(bp, validity_map[BP_PORT(bp)]);
 	if ((val & (SHR_MEM_VALIDITY_DEV_INFO | SHR_MEM_VALIDITY_MB))
 		!= (SHR_MEM_VALIDITY_DEV_INFO | SHR_MEM_VALIDITY_MB))
-		BNX2X_ERR("BAD MCP validity signature\n");
+		BNX2X_ERROR("BAD MCP validity signature\n");
 
 	bp->common.hw_config = SHMEM_RD(bp, dev_info.shared_hw_config.config);
 	BNX2X_DEV_INFO("hw_config 0x%08x\n", bp->common.hw_config);
@@ -9166,8 +9187,8 @@ static void __devinit bnx2x_get_common_hwinfo(struct bnx2x *bp)
 	if (val < BNX2X_BC_VER) {
 		/* for now only warn
 		 * later we might need to enforce this */
-		BNX2X_ERR("This driver needs bc_ver %X but found %X,"
-			  " please upgrade BC\n", BNX2X_BC_VER, val);
+		BNX2X_ERROR("This driver needs bc_ver %X but found %X, "
+			    "please upgrade BC\n", BNX2X_BC_VER, val);
 	}
 	bp->link_params.feature_config_flags |=
 		(val >= REQ_BC_VER_4_VRFY_OPT_MDL) ?
@@ -9188,7 +9209,8 @@ static void __devinit bnx2x_get_common_hwinfo(struct bnx2x *bp)
 	val3 = SHMEM_RD(bp, dev_info.shared_hw_config.part_num[8]);
 	val4 = SHMEM_RD(bp, dev_info.shared_hw_config.part_num[12]);
 
-	pr_info("part number %X-%X-%X-%X\n", val, val2, val3, val4);
+	dev_info(&bp->pdev->dev, "part number %X-%X-%X-%X\n",
+		 val, val2, val3, val4);
 }
 
 static void __devinit bnx2x_link_settings_supported(struct bnx2x *bp,
@@ -9466,11 +9488,11 @@ static void __devinit bnx2x_link_settings_requested(struct bnx2x *bp)
 			bp->port.advertising = (ADVERTISED_10baseT_Full |
 						ADVERTISED_TP);
 		} else {
-			BNX2X_ERR("NVRAM config error. "
-				  "Invalid link_config 0x%x"
-				  "  speed_cap_mask 0x%x\n",
-				  bp->port.link_config,
-				  bp->link_params.speed_cap_mask);
+			BNX2X_ERROR("NVRAM config error. "
+				    "Invalid link_config 0x%x"
+				    "  speed_cap_mask 0x%x\n",
+				    bp->port.link_config,
+				    bp->link_params.speed_cap_mask);
 			return;
 		}
 		break;
@@ -9482,11 +9504,11 @@ static void __devinit bnx2x_link_settings_requested(struct bnx2x *bp)
 			bp->port.advertising = (ADVERTISED_10baseT_Half |
 						ADVERTISED_TP);
 		} else {
-			BNX2X_ERR("NVRAM config error. "
-				  "Invalid link_config 0x%x"
-				  "  speed_cap_mask 0x%x\n",
-				  bp->port.link_config,
-				  bp->link_params.speed_cap_mask);
+			BNX2X_ERROR("NVRAM config error. "
+				    "Invalid link_config 0x%x"
+				    "  speed_cap_mask 0x%x\n",
+				    bp->port.link_config,
+				    bp->link_params.speed_cap_mask);
 			return;
 		}
 		break;
@@ -9497,11 +9519,11 @@ static void __devinit bnx2x_link_settings_requested(struct bnx2x *bp)
 			bp->port.advertising = (ADVERTISED_100baseT_Full |
 						ADVERTISED_TP);
 		} else {
-			BNX2X_ERR("NVRAM config error. "
-				  "Invalid link_config 0x%x"
-				  "  speed_cap_mask 0x%x\n",
-				  bp->port.link_config,
-				  bp->link_params.speed_cap_mask);
+			BNX2X_ERROR("NVRAM config error. "
+				    "Invalid link_config 0x%x"
+				    "  speed_cap_mask 0x%x\n",
+				    bp->port.link_config,
+				    bp->link_params.speed_cap_mask);
 			return;
 		}
 		break;
@@ -9513,11 +9535,11 @@ static void __devinit bnx2x_link_settings_requested(struct bnx2x *bp)
 			bp->port.advertising = (ADVERTISED_100baseT_Half |
 						ADVERTISED_TP);
 		} else {
-			BNX2X_ERR("NVRAM config error. "
-				  "Invalid link_config 0x%x"
-				  "  speed_cap_mask 0x%x\n",
-				  bp->port.link_config,
-				  bp->link_params.speed_cap_mask);
+			BNX2X_ERROR("NVRAM config error. "
+				    "Invalid link_config 0x%x"
+				    "  speed_cap_mask 0x%x\n",
+				    bp->port.link_config,
+				    bp->link_params.speed_cap_mask);
 			return;
 		}
 		break;
@@ -9528,11 +9550,11 @@ static void __devinit bnx2x_link_settings_requested(struct bnx2x *bp)
 			bp->port.advertising = (ADVERTISED_1000baseT_Full |
 						ADVERTISED_TP);
 		} else {
-			BNX2X_ERR("NVRAM config error. "
-				  "Invalid link_config 0x%x"
-				  "  speed_cap_mask 0x%x\n",
-				  bp->port.link_config,
-				  bp->link_params.speed_cap_mask);
+			BNX2X_ERROR("NVRAM config error. "
+				    "Invalid link_config 0x%x"
+				    "  speed_cap_mask 0x%x\n",
+				    bp->port.link_config,
+				    bp->link_params.speed_cap_mask);
 			return;
 		}
 		break;
@@ -9543,11 +9565,11 @@ static void __devinit bnx2x_link_settings_requested(struct bnx2x *bp)
 			bp->port.advertising = (ADVERTISED_2500baseX_Full |
 						ADVERTISED_TP);
 		} else {
-			BNX2X_ERR("NVRAM config error. "
-				  "Invalid link_config 0x%x"
-				  "  speed_cap_mask 0x%x\n",
-				  bp->port.link_config,
-				  bp->link_params.speed_cap_mask);
+			BNX2X_ERROR("NVRAM config error. "
+				    "Invalid link_config 0x%x"
+				    "  speed_cap_mask 0x%x\n",
+				    bp->port.link_config,
+				    bp->link_params.speed_cap_mask);
 			return;
 		}
 		break;
@@ -9560,19 +9582,19 @@ static void __devinit bnx2x_link_settings_requested(struct bnx2x *bp)
 			bp->port.advertising = (ADVERTISED_10000baseT_Full |
 						ADVERTISED_FIBRE);
 		} else {
-			BNX2X_ERR("NVRAM config error. "
-				  "Invalid link_config 0x%x"
-				  "  speed_cap_mask 0x%x\n",
-				  bp->port.link_config,
-				  bp->link_params.speed_cap_mask);
+			BNX2X_ERROR("NVRAM config error. "
+				    "Invalid link_config 0x%x"
+				    "  speed_cap_mask 0x%x\n",
+				    bp->port.link_config,
+				    bp->link_params.speed_cap_mask);
 			return;
 		}
 		break;
 
 	default:
-		BNX2X_ERR("NVRAM config error. "
-			  "BAD link speed link_config 0x%x\n",
-			  bp->port.link_config);
+		BNX2X_ERROR("NVRAM config error. "
+			    "BAD link speed link_config 0x%x\n",
+			    bp->port.link_config);
 		bp->link_params.req_line_speed = SPEED_AUTO_NEG;
 		bp->port.advertising = bp->port.supported;
 		break;
@@ -9722,14 +9744,14 @@ static int __devinit bnx2x_get_hwinfo(struct bnx2x *bp)
 					       "(0x%04x)\n",
 					       func, bp->e1hov, bp->e1hov);
 			} else {
-				BNX2X_ERR("!!!  No valid E1HOV for func %d,"
-					  "  aborting\n", func);
+				BNX2X_ERROR("No valid E1HOV for func %d,"
+					    "  aborting\n", func);
 				rc = -EPERM;
 			}
 		} else {
 			if (BP_E1HVN(bp)) {
-				BNX2X_ERR("!!!  VN %d in single function mode,"
-					  "  aborting\n", BP_E1HVN(bp));
+				BNX2X_ERROR("VN %d in single function mode,"
+					    "  aborting\n", BP_E1HVN(bp));
 				rc = -EPERM;
 			}
 		}
@@ -9765,7 +9787,7 @@ static int __devinit bnx2x_get_hwinfo(struct bnx2x *bp)
 
 	if (BP_NOMCP(bp)) {
 		/* only supposed to happen on emulation/FPGA */
-		BNX2X_ERR("warning random MAC workaround active\n");
+		BNX2X_ERROR("warning: random MAC workaround active\n");
 		random_ether_addr(bp->dev->dev_addr);
 		memcpy(bp->dev->perm_addr, bp->dev->dev_addr, ETH_ALEN);
 	}
@@ -9934,15 +9956,17 @@ static int __devinit bnx2x_init_bp(struct bnx2x *bp)
 		bnx2x_undi_unload(bp);
 
 	if (CHIP_REV_IS_FPGA(bp))
-		pr_err("FPGA detected\n");
+		dev_err(&bp->pdev->dev, "FPGA detected\n");
 
 	if (BP_NOMCP(bp) && (func == 0))
-		pr_err("MCP disabled, must load devices in order!\n");
+		dev_err(&bp->pdev->dev, "MCP disabled, "
+					"must load devices in order!\n");
 
 	/* Set multi queue mode */
 	if ((multi_mode != ETH_RSS_MODE_DISABLED) &&
 	    ((int_mode == INT_MODE_INTx) || (int_mode == INT_MODE_MSI))) {
-		pr_err("Multi disabled since int_mode requested is not MSI-X\n");
+		dev_err(&bp->pdev->dev, "Multi disabled since int_mode "
+					"requested is not MSI-X\n");
 		multi_mode = ETH_RSS_MODE_DISABLED;
 	}
 	bp->multi_mode = multi_mode;
@@ -10859,19 +10883,18 @@ static int bnx2x_get_coalesce(struct net_device *dev,
 	return 0;
 }
 
-#define BNX2X_MAX_COALES_TOUT  (0xf0*12) /* Maximal coalescing timeout in us */
 static int bnx2x_set_coalesce(struct net_device *dev,
 			      struct ethtool_coalesce *coal)
 {
 	struct bnx2x *bp = netdev_priv(dev);
 
-	bp->rx_ticks = (u16) coal->rx_coalesce_usecs;
-	if (bp->rx_ticks > BNX2X_MAX_COALES_TOUT)
-		bp->rx_ticks = BNX2X_MAX_COALES_TOUT;
+	bp->rx_ticks = (u16)coal->rx_coalesce_usecs;
+	if (bp->rx_ticks > BNX2X_MAX_COALESCE_TOUT)
+		bp->rx_ticks = BNX2X_MAX_COALESCE_TOUT;
 
-	bp->tx_ticks = (u16) coal->tx_coalesce_usecs;
-	if (bp->tx_ticks > BNX2X_MAX_COALES_TOUT)
-		bp->tx_ticks = BNX2X_MAX_COALES_TOUT;
+	bp->tx_ticks = (u16)coal->tx_coalesce_usecs;
+	if (bp->tx_ticks > BNX2X_MAX_COALESCE_TOUT)
+		bp->tx_ticks = BNX2X_MAX_COALESCE_TOUT;
 
 	if (netif_running(dev))
 		bnx2x_update_coalesce(bp);
@@ -11082,9 +11105,9 @@ static int bnx2x_test_registers(struct bnx2x *bp)
 	u32 wr_val = 0;
 	int port = BP_PORT(bp);
 	static const struct {
-		u32  offset0;
-		u32  offset1;
-		u32  mask;
+		u32 offset0;
+		u32 offset1;
+		u32 mask;
 	} reg_tbl[] = {
 /* 0 */		{ BRB1_REG_PAUSE_LOW_THRESHOLD_0,      4, 0x000003ff },
 		{ DORQ_REG_DB_ADDR0,                   4, 0xffffffff },
@@ -11157,9 +11180,13 @@ static int bnx2x_test_registers(struct bnx2x *bp)
 			/* Restore the original register's value */
 			REG_WR(bp, offset, save_val);
 
-			/* verify that value is as expected value */
-			if ((val & mask) != (wr_val & mask))
+			/* verify value is as expected */
+			if ((val & mask) != (wr_val & mask)) {
+				DP(NETIF_MSG_PROBE,
+				   "offset 0x%x: val 0x%x != 0x%x mask 0x%x\n",
+				   offset, val, wr_val, mask);
 				goto test_reg_exit;
+			}
 		}
 	}
 
@@ -11708,7 +11735,7 @@ static int bnx2x_get_sset_count(struct net_device *dev, int stringset)
 	struct bnx2x *bp = netdev_priv(dev);
 	int i, num_stats;
 
-	switch(stringset) {
+	switch (stringset) {
 	case ETH_SS_STATS:
 		if (is_multi(bp)) {
 			num_stats = BNX2X_NUM_Q_STATS * bp->num_queues;
@@ -12869,18 +12896,21 @@ static int __devinit bnx2x_init_dev(struct pci_dev *pdev,
 
 	rc = pci_enable_device(pdev);
 	if (rc) {
-		pr_err("Cannot enable PCI device, aborting\n");
+		dev_err(&bp->pdev->dev,
+			"Cannot enable PCI device, aborting\n");
 		goto err_out;
 	}
 
 	if (!(pci_resource_flags(pdev, 0) & IORESOURCE_MEM)) {
-		pr_err("Cannot find PCI device base address, aborting\n");
+		dev_err(&bp->pdev->dev,
+			"Cannot find PCI device base address, aborting\n");
 		rc = -ENODEV;
 		goto err_out_disable;
 	}
 
 	if (!(pci_resource_flags(pdev, 2) & IORESOURCE_MEM)) {
-		pr_err("Cannot find second PCI device base address, aborting\n");
+		dev_err(&bp->pdev->dev, "Cannot find second PCI device"
+		       " base address, aborting\n");
 		rc = -ENODEV;
 		goto err_out_disable;
 	}
@@ -12888,7 +12908,8 @@ static int __devinit bnx2x_init_dev(struct pci_dev *pdev,
 	if (atomic_read(&pdev->enable_cnt) == 1) {
 		rc = pci_request_regions(pdev, DRV_MODULE_NAME);
 		if (rc) {
-			pr_err("Cannot obtain PCI resources, aborting\n");
+			dev_err(&bp->pdev->dev,
+				"Cannot obtain PCI resources, aborting\n");
 			goto err_out_disable;
 		}
 
@@ -12898,14 +12918,16 @@ static int __devinit bnx2x_init_dev(struct pci_dev *pdev,
 
 	bp->pm_cap = pci_find_capability(pdev, PCI_CAP_ID_PM);
 	if (bp->pm_cap == 0) {
-		pr_err("Cannot find power management capability, aborting\n");
+		dev_err(&bp->pdev->dev,
+			"Cannot find power management capability, aborting\n");
 		rc = -EIO;
 		goto err_out_release;
 	}
 
 	bp->pcie_cap = pci_find_capability(pdev, PCI_CAP_ID_EXP);
 	if (bp->pcie_cap == 0) {
-		pr_err("Cannot find PCI Express capability, aborting\n");
+		dev_err(&bp->pdev->dev,
+			"Cannot find PCI Express capability, aborting\n");
 		rc = -EIO;
 		goto err_out_release;
 	}
@@ -12913,13 +12946,15 @@ static int __devinit bnx2x_init_dev(struct pci_dev *pdev,
 	if (dma_set_mask(&pdev->dev, DMA_BIT_MASK(64)) == 0) {
 		bp->flags |= USING_DAC_FLAG;
 		if (dma_set_coherent_mask(&pdev->dev, DMA_BIT_MASK(64)) != 0) {
-			pr_err("dma_set_coherent_mask failed, aborting\n");
+			dev_err(&bp->pdev->dev, "dma_set_coherent_mask"
+			       " failed, aborting\n");
 			rc = -EIO;
 			goto err_out_release;
 		}
 
 	} else if (dma_set_mask(&pdev->dev, DMA_BIT_MASK(32)) != 0) {
-		pr_err("System does not support DMA, aborting\n");
+		dev_err(&bp->pdev->dev,
+			"System does not support DMA, aborting\n");
 		rc = -EIO;
 		goto err_out_release;
 	}
@@ -12932,7 +12967,8 @@ static int __devinit bnx2x_init_dev(struct pci_dev *pdev,
 
 	bp->regview = pci_ioremap_bar(pdev, 0);
 	if (!bp->regview) {
-		pr_err("Cannot map register space, aborting\n");
+		dev_err(&bp->pdev->dev,
+			"Cannot map register space, aborting\n");
 		rc = -ENOMEM;
 		goto err_out_release;
 	}
@@ -12941,7 +12977,8 @@ static int __devinit bnx2x_init_dev(struct pci_dev *pdev,
 					min_t(u64, BNX2X_DB_SIZE,
 					      pci_resource_len(pdev, 2)));
 	if (!bp->doorbells) {
-		pr_err("Cannot map doorbell space, aborting\n");
+		dev_err(&bp->pdev->dev,
+			"Cannot map doorbell space, aborting\n");
 		rc = -ENOMEM;
 		goto err_out_unmap;
 	}
@@ -13046,7 +13083,8 @@ static int __devinit bnx2x_check_firmware(struct bnx2x *bp)
 		offset = be32_to_cpu(sections[i].offset);
 		len = be32_to_cpu(sections[i].len);
 		if (offset + len > firmware->size) {
-			pr_err("Section %d length is out of bounds\n", i);
+			dev_err(&bp->pdev->dev,
+				"Section %d length is out of bounds\n", i);
 			return -EINVAL;
 		}
 	}
@@ -13058,7 +13096,8 @@ static int __devinit bnx2x_check_firmware(struct bnx2x *bp)
 
 	for (i = 0; i < be32_to_cpu(fw_hdr->init_ops_offsets.len) / 2; i++) {
 		if (be16_to_cpu(ops_offsets[i]) > num_ops) {
-			pr_err("Section offset %d is out of bounds\n", i);
+			dev_err(&bp->pdev->dev,
+				"Section offset %d is out of bounds\n", i);
 			return -EINVAL;
 		}
 	}
@@ -13070,7 +13109,8 @@ static int __devinit bnx2x_check_firmware(struct bnx2x *bp)
 	    (fw_ver[1] != BCM_5710_FW_MINOR_VERSION) ||
 	    (fw_ver[2] != BCM_5710_FW_REVISION_VERSION) ||
 	    (fw_ver[3] != BCM_5710_FW_ENGINEERING_VERSION)) {
-		pr_err("Bad FW version:%d.%d.%d.%d. Should be %d.%d.%d.%d\n",
+		dev_err(&bp->pdev->dev,
+			"Bad FW version:%d.%d.%d.%d. Should be %d.%d.%d.%d\n",
 		       fw_ver[0], fw_ver[1], fw_ver[2],
 		       fw_ver[3], BCM_5710_FW_MAJOR_VERSION,
 		       BCM_5710_FW_MINOR_VERSION,
@@ -13105,8 +13145,8 @@ static inline void bnx2x_prep_ops(const u8 *_source, u8 *_target, u32 n)
 	for (i = 0, j = 0; i < n/8; i++, j += 2) {
 		tmp = be32_to_cpu(source[j]);
 		target[i].op = (tmp >> 24) & 0xff;
-		target[i].offset =  tmp & 0xffffff;
-		target[i].raw_data = be32_to_cpu(source[j+1]);
+		target[i].offset = tmp & 0xffffff;
+		target[i].raw_data = be32_to_cpu(source[j + 1]);
 	}
 }
 
@@ -13140,20 +13180,24 @@ static int __devinit bnx2x_init_firmware(struct bnx2x *bp, struct device *dev)
 
 	if (CHIP_IS_E1(bp))
 		fw_file_name = FW_FILE_NAME_E1;
-	else
+	else if (CHIP_IS_E1H(bp))
 		fw_file_name = FW_FILE_NAME_E1H;
+	else {
+		dev_err(dev, "Unsupported chip revision\n");
+		return -EINVAL;
+	}
 
-	pr_info("Loading %s\n", fw_file_name);
+	dev_info(dev, "Loading %s\n", fw_file_name);
 
 	rc = request_firmware(&bp->firmware, fw_file_name, dev);
 	if (rc) {
-		pr_err("Can't load firmware file %s\n", fw_file_name);
+		dev_err(dev, "Can't load firmware file %s\n", fw_file_name);
 		goto request_firmware_exit;
 	}
 
 	rc = bnx2x_check_firmware(bp);
 	if (rc) {
-		pr_err("Corrupt firmware file %s\n", fw_file_name);
+		dev_err(dev, "Corrupt firmware file %s\n", fw_file_name);
 		goto request_firmware_exit;
 	}
 
@@ -13212,7 +13256,7 @@ static int __devinit bnx2x_init_one(struct pci_dev *pdev,
 	/* dev zeroed in init_etherdev */
 	dev = alloc_etherdev_mq(sizeof(*bp), MAX_CONTEXT);
 	if (!dev) {
-		pr_err("Cannot allocate net device\n");
+		dev_err(&pdev->dev, "Cannot allocate net device\n");
 		return -ENOMEM;
 	}
 
@@ -13234,7 +13278,7 @@ static int __devinit bnx2x_init_one(struct pci_dev *pdev,
 	/* Set init arrays */
 	rc = bnx2x_init_firmware(bp, &pdev->dev);
 	if (rc) {
-		pr_err("Error loading firmware\n");
+		dev_err(&pdev->dev, "Error loading firmware\n");
 		goto init_one_exit;
 	}
 
@@ -13245,11 +13289,12 @@ static int __devinit bnx2x_init_one(struct pci_dev *pdev,
 	}
 
 	bnx2x_get_pcie_width_speed(bp, &pcie_width, &pcie_speed);
-	netdev_info(dev, "%s (%c%d) PCI-E x%d %s found at mem %lx, IRQ %d, node addr %pM\n",
-		    board_info[ent->driver_data].name,
-		    (CHIP_REV(bp) >> 12) + 'A', (CHIP_METAL(bp) >> 4),
-		    pcie_width, (pcie_speed == 2) ? "5GHz (Gen2)" : "2.5GHz",
-		    dev->base_addr, bp->pdev->irq, dev->dev_addr);
+	netdev_info(dev, "%s (%c%d) PCI-E x%d %s found at mem %lx,"
+	       " IRQ %d, ", board_info[ent->driver_data].name,
+	       (CHIP_REV(bp) >> 12) + 'A', (CHIP_METAL(bp) >> 4),
+	       pcie_width, (pcie_speed == 2) ? "5GHz (Gen2)" : "2.5GHz",
+	       dev->base_addr, bp->pdev->irq);
+	pr_cont("node addr %pM\n", dev->dev_addr);
 
 	return 0;
 
@@ -13277,7 +13322,7 @@ static void __devexit bnx2x_remove_one(struct pci_dev *pdev)
 	struct bnx2x *bp;
 
 	if (!dev) {
-		pr_err("BAD net device from bnx2x_init_one\n");
+		dev_err(&pdev->dev, "BAD net device from bnx2x_init_one\n");
 		return;
 	}
 	bp = netdev_priv(dev);
@@ -13313,7 +13358,7 @@ static int bnx2x_suspend(struct pci_dev *pdev, pm_message_t state)
 	struct bnx2x *bp;
 
 	if (!dev) {
-		pr_err("BAD net device from bnx2x_init_one\n");
+		dev_err(&pdev->dev, "BAD net device from bnx2x_init_one\n");
 		return -ENODEV;
 	}
 	bp = netdev_priv(dev);
@@ -13345,7 +13390,7 @@ static int bnx2x_resume(struct pci_dev *pdev)
 	int rc;
 
 	if (!dev) {
-		pr_err("BAD net device from bnx2x_init_one\n");
+		dev_err(&pdev->dev, "BAD net device from bnx2x_init_one\n");
 		return -ENODEV;
 	}
 	bp = netdev_priv(dev);
-- 
1.6.3.3






* [PATCH net-next-2.6 6/11] bnx2x: Added new statistics
From: Vladislav Zolotarov @ 2010-04-19 11:14 UTC (permalink / raw)
  To: Dave Miller; +Cc: Eilon Greenstein, netdev list, dmitry

Added total_mcast/bcast_pkts_transmitted statistics.

Author: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Vladislav Zolotarov <vladz@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
---
 drivers/net/bnx2x.h      |    4 +-
 drivers/net/bnx2x_main.c |  118 +++++++++++++++++++++++++++++-----------------
 2 files changed, 76 insertions(+), 46 deletions(-)

diff --git a/drivers/net/bnx2x.h b/drivers/net/bnx2x.h
index 236235f..a9130f6 100644
--- a/drivers/net/bnx2x.h
+++ b/drivers/net/bnx2x.h
@@ -269,7 +269,7 @@ struct bnx2x_eth_q_stats {
 	u32 hw_csum_err;
 };
 
-#define BNX2X_NUM_Q_STATS		11
+#define BNX2X_NUM_Q_STATS		13
 #define Q_STATS_OFFSET32(stat_name) \
 			(offsetof(struct bnx2x_eth_q_stats, stat_name) / 4)
 
@@ -775,7 +775,7 @@ struct bnx2x_eth_stats {
 	u32 nig_timer_max;
 };
 
-#define BNX2X_NUM_STATS			41
+#define BNX2X_NUM_STATS			43
 #define STATS_OFFSET32(stat_name) \
 			(offsetof(struct bnx2x_eth_stats, stat_name) / 4)
 
diff --git a/drivers/net/bnx2x_main.c b/drivers/net/bnx2x_main.c
index a3b851f..25a8bbf 100644
--- a/drivers/net/bnx2x_main.c
+++ b/drivers/net/bnx2x_main.c
@@ -3568,7 +3568,6 @@ static void bnx2x_sp_task(struct work_struct *work)
 	struct bnx2x *bp = container_of(work, struct bnx2x, sp_task.work);
 	u16 status;
 
-
 	/* Return here if interrupt is disabled */
 	if (unlikely(atomic_read(&bp->intr_sem) != 0)) {
 		DP(NETIF_MSG_INTR, "called but intr_sem not 0, returning\n");
@@ -4425,6 +4424,21 @@ static int bnx2x_storm_stats_update(struct bnx2x *bp)
 		       qstats->total_bytes_received_lo,
 		       le32_to_cpu(tclient->rcv_unicast_bytes.lo));
 
+		SUB_64(qstats->total_bytes_received_hi,
+		       le32_to_cpu(uclient->bcast_no_buff_bytes.hi),
+		       qstats->total_bytes_received_lo,
+		       le32_to_cpu(uclient->bcast_no_buff_bytes.lo));
+
+		SUB_64(qstats->total_bytes_received_hi,
+		       le32_to_cpu(uclient->mcast_no_buff_bytes.hi),
+		       qstats->total_bytes_received_lo,
+		       le32_to_cpu(uclient->mcast_no_buff_bytes.lo));
+
+		SUB_64(qstats->total_bytes_received_hi,
+		       le32_to_cpu(uclient->ucast_no_buff_bytes.hi),
+		       qstats->total_bytes_received_lo,
+		       le32_to_cpu(uclient->ucast_no_buff_bytes.lo));
+
 		qstats->valid_bytes_received_hi =
 					qstats->total_bytes_received_hi;
 		qstats->valid_bytes_received_lo =
@@ -4673,47 +4687,43 @@ static void bnx2x_stats_update(struct bnx2x *bp)
 	bnx2x_drv_stats_update(bp);
 
 	if (netif_msg_timer(bp)) {
-		struct bnx2x_fastpath *fp0_rx = bp->fp;
-		struct bnx2x_fastpath *fp0_tx = bp->fp;
-		struct tstorm_per_client_stats *old_tclient =
-							&bp->fp->old_tclient;
-		struct bnx2x_eth_q_stats *qstats = &bp->fp->eth_q_stats;
 		struct bnx2x_eth_stats *estats = &bp->eth_stats;
-		struct net_device_stats *nstats = &bp->dev->stats;
 		int i;
 
-		netdev_printk(KERN_DEBUG, bp->dev, "\n");
-		printk(KERN_DEBUG "  tx avail (%4x)  tx hc idx (%x)"
-				  "  tx pkt (%lx)\n",
-		       bnx2x_tx_avail(fp0_tx),
-		       le16_to_cpu(*fp0_tx->tx_cons_sb), nstats->tx_packets);
-		printk(KERN_DEBUG "  rx usage (%4x)  rx hc idx (%x)"
-				  "  rx pkt (%lx)\n",
-		       (u16)(le16_to_cpu(*fp0_rx->rx_cons_sb) -
-			     fp0_rx->rx_comp_cons),
-		       le16_to_cpu(*fp0_rx->rx_cons_sb), nstats->rx_packets);
-		printk(KERN_DEBUG "  %s (Xoff events %u)  brb drops %u  "
-				  "brb truncate %u\n",
-		       (netif_queue_stopped(bp->dev) ? "Xoff" : "Xon"),
-		       qstats->driver_xoff,
+		printk(KERN_DEBUG "%s: brb drops %u  brb truncate %u\n",
+		       bp->dev->name,
 		       estats->brb_drop_lo, estats->brb_truncate_lo);
-		printk(KERN_DEBUG "tstats: checksum_discard %u  "
-			"packets_too_big_discard %lu  no_buff_discard %lu  "
-			"mac_discard %u  mac_filter_discard %u  "
-			"xxovrflow_discard %u  brb_truncate_discard %u  "
-			"ttl0_discard %u\n",
-		       le32_to_cpu(old_tclient->checksum_discard),
-		       bnx2x_hilo(&qstats->etherstatsoverrsizepkts_hi),
-		       bnx2x_hilo(&qstats->no_buff_discard_hi),
-		       estats->mac_discard, estats->mac_filter_discard,
-		       estats->xxoverflow_discard, estats->brb_truncate_discard,
-		       le32_to_cpu(old_tclient->ttl0_discard));
 
 		for_each_queue(bp, i) {
-			printk(KERN_DEBUG "[%d]: %lu\t%lu\t%lu\n", i,
-			       bnx2x_fp(bp, i, tx_pkt),
-			       bnx2x_fp(bp, i, rx_pkt),
-			       bnx2x_fp(bp, i, rx_calls));
+			struct bnx2x_fastpath *fp = &bp->fp[i];
+			struct bnx2x_eth_q_stats *qstats = &fp->eth_q_stats;
+
+			printk(KERN_DEBUG "%s: rx usage(%4u)  *rx_cons_sb(%u)"
+					  "  rx pkt(%lu)  rx calls(%lu %lu)\n",
+			       fp->name, (le16_to_cpu(*fp->rx_cons_sb) -
+			       fp->rx_comp_cons),
+			       le16_to_cpu(*fp->rx_cons_sb),
+			       bnx2x_hilo(&qstats->
+					  total_unicast_packets_received_hi),
+			       fp->rx_calls, fp->rx_pkt);
+		}
+
+		for_each_queue(bp, i) {
+			struct bnx2x_fastpath *fp = &bp->fp[i];
+			struct bnx2x_eth_q_stats *qstats = &fp->eth_q_stats;
+			struct netdev_queue *txq =
+				netdev_get_tx_queue(bp->dev, i);
+
+			printk(KERN_DEBUG "%s: tx avail(%4u)  *tx_cons_sb(%u)"
+					  "  tx pkt(%lu) tx calls (%lu)"
+					  "  %s (Xoff events %u)\n",
+			       fp->name, bnx2x_tx_avail(fp),
+			       le16_to_cpu(*fp->tx_cons_sb),
+			       bnx2x_hilo(&qstats->
+					  total_unicast_packets_transmitted_hi),
+			       fp->tx_pkt,
+			       (netif_tx_queue_stopped(txq) ? "Xoff" : "Xon"),
+			       qstats->driver_xoff);
 		}
 	}
 
@@ -11640,7 +11650,11 @@ static const struct {
 
 /* 10 */{ Q_STATS_OFFSET32(total_bytes_transmitted_hi),	8, "[%d]: tx_bytes" },
 	{ Q_STATS_OFFSET32(total_unicast_packets_transmitted_hi),
-							8, "[%d]: tx_packets" }
+						8, "[%d]: tx_ucast_packets" },
+	{ Q_STATS_OFFSET32(total_multicast_packets_transmitted_hi),
+						8, "[%d]: tx_mcast_packets" },
+	{ Q_STATS_OFFSET32(total_broadcast_packets_transmitted_hi),
+						8, "[%d]: tx_bcast_packets" }
 };
 
 static const struct {
@@ -11702,16 +11716,20 @@ static const struct {
 	{ STATS_OFFSET32(tx_stat_ifhcoutbadoctets_hi),
 				8, STATS_FLAGS_PORT, "tx_error_bytes" },
 	{ STATS_OFFSET32(total_unicast_packets_transmitted_hi),
-				8, STATS_FLAGS_BOTH, "tx_packets" },
+				8, STATS_FLAGS_BOTH, "tx_ucast_packets" },
+	{ STATS_OFFSET32(total_multicast_packets_transmitted_hi),
+				8, STATS_FLAGS_BOTH, "tx_mcast_packets" },
+	{ STATS_OFFSET32(total_broadcast_packets_transmitted_hi),
+				8, STATS_FLAGS_BOTH, "tx_bcast_packets" },
 	{ STATS_OFFSET32(tx_stat_dot3statsinternalmactransmiterrors_hi),
 				8, STATS_FLAGS_PORT, "tx_mac_errors" },
 	{ STATS_OFFSET32(rx_stat_dot3statscarriersenseerrors_hi),
 				8, STATS_FLAGS_PORT, "tx_carrier_errors" },
-	{ STATS_OFFSET32(tx_stat_dot3statssinglecollisionframes_hi),
+/* 30 */{ STATS_OFFSET32(tx_stat_dot3statssinglecollisionframes_hi),
 				8, STATS_FLAGS_PORT, "tx_single_collisions" },
 	{ STATS_OFFSET32(tx_stat_dot3statsmultiplecollisionframes_hi),
 				8, STATS_FLAGS_PORT, "tx_multi_collisions" },
-/* 30 */{ STATS_OFFSET32(tx_stat_dot3statsdeferredtransmissions_hi),
+	{ STATS_OFFSET32(tx_stat_dot3statsdeferredtransmissions_hi),
 				8, STATS_FLAGS_PORT, "tx_deferred" },
 	{ STATS_OFFSET32(tx_stat_dot3statsexcessivecollisions_hi),
 				8, STATS_FLAGS_PORT, "tx_excess_collisions" },
@@ -11727,11 +11745,11 @@ static const struct {
 			8, STATS_FLAGS_PORT, "tx_128_to_255_byte_packets" },
 	{ STATS_OFFSET32(tx_stat_etherstatspkts256octetsto511octets_hi),
 			8, STATS_FLAGS_PORT, "tx_256_to_511_byte_packets" },
-	{ STATS_OFFSET32(tx_stat_etherstatspkts512octetsto1023octets_hi),
+/* 40 */{ STATS_OFFSET32(tx_stat_etherstatspkts512octetsto1023octets_hi),
 			8, STATS_FLAGS_PORT, "tx_512_to_1023_byte_packets" },
 	{ STATS_OFFSET32(etherstatspkts1024octetsto1522octets_hi),
 			8, STATS_FLAGS_PORT, "tx_1024_to_1522_byte_packets" },
-/* 40 */{ STATS_OFFSET32(etherstatspktsover1522octets_hi),
+	{ STATS_OFFSET32(etherstatspktsover1522octets_hi),
 			8, STATS_FLAGS_PORT, "tx_1523_to_9022_byte_packets" },
 	{ STATS_OFFSET32(pause_frames_sent_hi),
 				8, STATS_FLAGS_PORT, "tx_pause_frames" }
@@ -12266,6 +12284,8 @@ static netdev_tx_t bnx2x_start_xmit(struct sk_buff *skb, struct net_device *dev)
 	int i;
 	u8 hlen = 0;
 	__le16 pkt_size = 0;
+	struct ethhdr *eth;
+	u8 mac_type = UNICAST_ADDRESS;
 
 #ifdef BNX2X_STOP_ON_ERROR
 	if (unlikely(bp->panic))
@@ -12289,6 +12309,16 @@ static netdev_tx_t bnx2x_start_xmit(struct sk_buff *skb, struct net_device *dev)
 	   skb->ip_summed, skb->protocol, ipv6_hdr(skb)->nexthdr,
 	   ip_hdr(skb)->protocol, skb_shinfo(skb)->gso_type, xmit_type);
 
+	eth = (struct ethhdr *)skb->data;
+
+	/* set flag according to packet type (UNICAST_ADDRESS is default)*/
+	if (unlikely(is_multicast_ether_addr(eth->h_dest))) {
+		if (is_broadcast_ether_addr(eth->h_dest))
+			mac_type = BROADCAST_ADDRESS;
+		else
+			mac_type = MULTICAST_ADDRESS;
+	}
+
 #if (MAX_SKB_FRAGS >= MAX_FETCH_BD - 3)
 	/* First, check if we need to linearize the skb (due to FW
 	   restrictions). No need to check fragmentation if page size > 8K
@@ -12322,8 +12352,8 @@ static netdev_tx_t bnx2x_start_xmit(struct sk_buff *skb, struct net_device *dev)
 	tx_start_bd = &fp->tx_desc_ring[bd_prod].start_bd;
 
 	tx_start_bd->bd_flags.as_bitfield = ETH_TX_BD_FLAGS_START_BD;
-	tx_start_bd->general_data = (UNICAST_ADDRESS <<
-				     ETH_TX_START_BD_ETH_ADDR_TYPE_SHIFT);
+	tx_start_bd->general_data =  (mac_type <<
+					ETH_TX_START_BD_ETH_ADDR_TYPE_SHIFT);
 	/* header nbd */
 	tx_start_bd->general_data |= (1 << ETH_TX_START_BD_HDR_NBDS_SHIFT);
 
-- 
1.6.3.3
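
The destination-address check added to bnx2x_start_xmit() in this patch can be tried out in user space. The following is only a minimal sketch, not the driver code: the enum values and the name classify_dest() are hypothetical, and the two helpers merely mirror what the kernel's is_multicast_ether_addr() and is_broadcast_ether_addr() test for.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the firmware address-type values. */
enum addr_type { ADDR_UNICAST, ADDR_MULTICAST, ADDR_BROADCAST };

/* The group bit is the least significant bit of the first octet. */
static bool is_multicast(const uint8_t a[6])
{
	return (a[0] & 0x01) != 0;
}

/* Broadcast is the all-ones address, which also has the group bit set. */
static bool is_broadcast(const uint8_t a[6])
{
	for (int i = 0; i < 6; i++)
		if (a[i] != 0xff)
			return false;
	return true;
}

/* Unicast is the default; broadcast is tested only for group addresses. */
static enum addr_type classify_dest(const uint8_t dest[6])
{
	if (is_multicast(dest))
		return is_broadcast(dest) ? ADDR_BROADCAST : ADDR_MULTICAST;
	return ADDR_UNICAST;
}
```

Testing the group bit first keeps the common unicast path to a single comparison, which is presumably why the patch wraps the check in unlikely().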





^ permalink raw reply related

* [PATCH net-next-2.6 7/11] bnx2x: Fixed MSI-X enabling flow
From: Vladislav Zolotarov @ 2010-04-19 11:14 UTC (permalink / raw)
  To: Dave Miller; +Cc: Eilon Greenstein, netdev list, dmitry

Try to enable fewer MSI-X vectors if the initial request has failed.

Author: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Vladislav Zolotarov <vladz@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
---
 drivers/net/bnx2x.h      |   18 +++++++++++++-----
 drivers/net/bnx2x_main.c |   28 +++++++++++++++++++++++++---
 2 files changed, 38 insertions(+), 8 deletions(-)

diff --git a/drivers/net/bnx2x.h b/drivers/net/bnx2x.h
index a9130f6..7abb2de 100644
--- a/drivers/net/bnx2x.h
+++ b/drivers/net/bnx2x.h
@@ -24,16 +24,25 @@
 #define BCM_VLAN			1
 #endif
 
+#define BNX2X_MULTI_QUEUE
+
+#define BNX2X_NEW_NAPI
+
+
+
 #if defined(CONFIG_CNIC) || defined(CONFIG_CNIC_MODULE)
 #define BCM_CNIC 1
 #include "cnic_if.h"
 #endif
 
-#define BNX2X_MULTI_QUEUE
-
-#define BNX2X_NEW_NAPI
-
 
+#ifdef BCM_CNIC
+#define BNX2X_MIN_MSIX_VEC_CNT 3
+#define BNX2X_MSIX_VEC_FP_START 2
+#else
+#define BNX2X_MIN_MSIX_VEC_CNT 2
+#define BNX2X_MSIX_VEC_FP_START 1
+#endif
 
 #include <linux/mdio.h>
 #include "bnx2x_reg.h"
@@ -859,7 +868,6 @@ struct bnx2x {
 #endif
 #define INT_MODE_INTx			1
 #define INT_MODE_MSI			2
-#define INT_MODE_MSIX			3
 
 	int			tx_ring_size;
 
diff --git a/drivers/net/bnx2x_main.c b/drivers/net/bnx2x_main.c
index 25a8bbf..484ff2b 100644
--- a/drivers/net/bnx2x_main.c
+++ b/drivers/net/bnx2x_main.c
@@ -7430,7 +7430,31 @@ static int bnx2x_enable_msix(struct bnx2x *bp)
 
 	rc = pci_enable_msix(bp->pdev, &bp->msix_table[0],
 			     BNX2X_NUM_QUEUES(bp) + offset);
-	if (rc) {
+
+	/*
+	 * reconfigure number of tx/rx queues according to available
+	 * MSI-X vectors
+	 */
+	if (rc >= BNX2X_MIN_MSIX_VEC_CNT) {
+		/* vectors available for FP */
+		int fp_vec = rc - BNX2X_MSIX_VEC_FP_START;
+
+		DP(NETIF_MSG_IFUP,
+		   "Trying to use less MSI-X vectors: %d\n", rc);
+
+		rc = pci_enable_msix(bp->pdev, &bp->msix_table[0], rc);
+
+		if (rc) {
+			DP(NETIF_MSG_IFUP,
+			   "MSI-X is not attainable  rc %d\n", rc);
+			return rc;
+		}
+
+		bp->num_queues = min(bp->num_queues, fp_vec);
+
+		DP(NETIF_MSG_IFUP, "New queue configuration set: %d\n",
+				  bp->num_queues);
+	} else if (rc) {
 		DP(NETIF_MSG_IFUP, "MSI-X is not attainable  rc %d\n", rc);
 		return rc;
 	}
@@ -7853,8 +7877,6 @@ static int bnx2x_set_num_queues(struct bnx2x *bp)
 		bp->num_queues = 1;
 		DP(NETIF_MSG_IFUP, "set number of queues to 1\n");
 		break;
-
-	case INT_MODE_MSIX:
 	default:
 		/* Set number of queues according to bp->multi_mode value */
 		bnx2x_set_num_queues_msix(bp);
-- 
1.6.3.3
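
The retry flow in bnx2x_enable_msix() relies on the pci_enable_msix() convention of returning 0 on success, a negative errno on failure, and a positive count when only that many vectors are available. A minimal user-space sketch of that flow follows. Everything in it is an assumption made for illustration: fake_enable_msix(), negotiate_queues() and the two constants are hypothetical, and unlike the patch the sketch folds every failure into -1.

```c
#define MIN_VEC_CNT  2	/* hypothetical: slow path + one fast-path vector */
#define VEC_FP_START 1	/* hypothetical: vectors used before the fast path */

/*
 * Hypothetical stand-in for the pci_enable_msix() contract:
 * 0 on success, a positive count when only that many vectors are
 * available, a negative value on hard failure.
 */
static int fake_enable_msix(int requested, int available)
{
	if (available <= 0)
		return -1;
	return requested <= available ? 0 : available;
}

/* Returns the number of fast-path queues to use, or -1 on failure. */
static int negotiate_queues(int num_queues, int available)
{
	int rc = fake_enable_msix(num_queues + VEC_FP_START, available);

	if (rc >= MIN_VEC_CNT) {
		/* vectors left for the fast path after the reserved ones */
		int fp_vec = rc - VEC_FP_START;

		/* retry with exactly the count the first call reported */
		if (fake_enable_msix(rc, available) != 0)
			return -1;
		return num_queues < fp_vec ? num_queues : fp_vec;
	}
	if (rc)
		return -1;	/* too few vectors, or a hard failure */
	return num_queues;
}
```

With 8 queues requested and 4 vectors available, the sketch settles on 3 fast-path queues, since one vector is reserved ahead of the fast path.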





^ permalink raw reply related

* [PATCH net-next-2.6 8/11] bnx2x: use mask in test_registers() to avoid parity error
From: Vladislav Zolotarov @ 2010-04-19 11:14 UTC (permalink / raw)
  To: Dave Miller; +Cc: Eilon Greenstein, netdev list

Properly mask the value to be written to the register (according to the register size) during the self-test.
Otherwise an immediate parity error would be generated.

Signed-off-by: Vladislav Zolotarov <vladz@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
---
 drivers/net/bnx2x_main.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/net/bnx2x_main.c b/drivers/net/bnx2x_main.c
index 484ff2b..d311476 100644
--- a/drivers/net/bnx2x_main.c
+++ b/drivers/net/bnx2x_main.c
@@ -11219,7 +11219,7 @@ static int bnx2x_test_registers(struct bnx2x *bp)
 
 			save_val = REG_RD(bp, offset);
 
-			REG_WR(bp, offset, wr_val);
+			REG_WR(bp, offset, (wr_val & mask));
 			val = REG_RD(bp, offset);
 
 			/* Restore the original register's value */
-- 
1.6.3.3
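
The effect of masking the written value can be illustrated with a toy model. This is a sketch under stated assumptions, not a description of the real hardware: struct fake_reg, fake_reg_write() and test_reg() are hypothetical, and the "parity error" is simply modelled as a flag that is set whenever bits outside the register's width are written.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a register narrower than 32 bits. */
struct fake_reg {
	uint32_t value;
	uint32_t mask;		/* bits that actually exist */
	bool parity_err;	/* modelled fault indicator */
};

static void fake_reg_write(struct fake_reg *r, uint32_t v)
{
	if (v & ~r->mask)
		r->parity_err = true;	/* wrote bits the register lacks */
	r->value = v & r->mask;
}

/* Write/read-back test that masks the pattern before writing it. */
static bool test_reg(struct fake_reg *r, uint32_t wr_val)
{
	uint32_t save = r->value;
	bool ok;

	fake_reg_write(r, wr_val & r->mask);
	ok = (r->value == (wr_val & r->mask)) && !r->parity_err;

	/* restore the original register value */
	fake_reg_write(r, save);
	return ok;
}
```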





^ permalink raw reply related

* [PATCH net-next-2.6 9/11] bnx2x: Rework power state handling code
From: Vladislav Zolotarov @ 2010-04-19 11:14 UTC (permalink / raw)
  To: Dave Miller; +Cc: Eilon Greenstein, netdev list

Move "don't shut down the power" logic into bnx2x_set_power_state() to make the code cleaner.

Signed-off-by: Vladislav Zolotarov <vladz@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
---
 drivers/net/bnx2x_main.c |   12 +++++++++---
 1 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/drivers/net/bnx2x_main.c b/drivers/net/bnx2x_main.c
index d311476..2eb9a3b 100644
--- a/drivers/net/bnx2x_main.c
+++ b/drivers/net/bnx2x_main.c
@@ -12017,6 +12017,14 @@ static int bnx2x_set_power_state(struct bnx2x *bp, pci_power_t state)
 		break;
 
 	case PCI_D3hot:
+		/* If there are other clients above don't
+		   shut down the power */
+		if (atomic_read(&bp->pdev->enable_cnt) != 1)
+			return 0;
+		/* Don't shut down the power for emulation and FPGA */
+		if (CHIP_REV_IS_SLOW(bp))
+			return 0;
+
 		pmcsr &= ~PCI_PM_CTRL_STATE_MASK;
 		pmcsr |= 3;
 
@@ -12629,9 +12637,7 @@ static int bnx2x_close(struct net_device *dev)
 
 	/* Unload the driver, release IRQs */
 	bnx2x_nic_unload(bp, UNLOAD_CLOSE);
-	if (atomic_read(&bp->pdev->enable_cnt) == 1)
-		if (!CHIP_REV_IS_SLOW(bp))
-			bnx2x_set_power_state(bp, PCI_D3hot);
+	bnx2x_set_power_state(bp, PCI_D3hot);
 
 	return 0;
 }
-- 
1.6.3.3
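
The point of the rework is that the caller no longer needs to know when powering down is allowed. A minimal sketch of that shape, with hypothetical names (struct fake_dev, set_power_state()) standing in for the driver structures:

```c
#include <stdbool.h>

enum pstate { P_D0, P_D3HOT };

/* Hypothetical device state for this sketch only. */
struct fake_dev {
	int enable_cnt;		/* number of clients that enabled the device */
	bool slow_chip;		/* emulation or FPGA platform */
	enum pstate state;
};

static int set_power_state(struct fake_dev *d, enum pstate s)
{
	if (s == P_D3HOT) {
		/* other clients are still using the device */
		if (d->enable_cnt != 1)
			return 0;
		/* never power down emulation and FPGA platforms */
		if (d->slow_chip)
			return 0;
	}
	d->state = s;
	return 0;
}
```

A close path can then call set_power_state(dev, P_D3HOT) unconditionally, as bnx2x_close() does after this patch.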





^ permalink raw reply related

* [PATCH net-next-2.6 10/11] bnx2x: Don't report link down if has been already down
From: Vladislav Zolotarov @ 2010-04-19 11:15 UTC (permalink / raw)
  To: Dave Miller; +Cc: Eilon Greenstein, netdev list, yanivr

Author: Yaniv Rosner <yanivr@broadcom.com>
Signed-off-by: Yaniv Rosner <yanivr@broadcom.com>
Signed-off-by: Vladislav Zolotarov <vladz@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
---
 drivers/net/bnx2x_main.c |    6 ++++--
 1 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/net/bnx2x_main.c b/drivers/net/bnx2x_main.c
index ab9a9eb..c4aac38 100644
--- a/drivers/net/bnx2x_main.c
+++ b/drivers/net/bnx2x_main.c
@@ -2469,6 +2469,7 @@ static void bnx2x_init_vn_minmax(struct bnx2x *bp, int func)
 /* This function is called upon link interrupt */
 static void bnx2x_link_attn(struct bnx2x *bp)
 {
+	u32 prev_link_status = bp->link_vars.link_status;
 	/* Make sure that we are synced with the current statistics */
 	bnx2x_stats_handle(bp, STATS_EVENT_STOP);
 
@@ -2501,8 +2502,9 @@ static void bnx2x_link_attn(struct bnx2x *bp)
 			bnx2x_stats_handle(bp, STATS_EVENT_LINK_UP);
 	}
 
-	/* indicate link status */
-	bnx2x_link_report(bp);
+	/* indicate link status only if link status actually changed */
+	if (prev_link_status != bp->link_vars.link_status)
+		bnx2x_link_report(bp);
 
 	if (IS_E1HMF(bp)) {
 		int port = BP_PORT(bp);
-- 
1.6.3.3
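
The change amounts to edge-triggered reporting: remember the status before handling the attention and report only if it differs afterwards. A minimal sketch, assuming a hypothetical struct fake_link whose reports counter stands in for calls to bnx2x_link_report():

```c
#include <stdint.h>

/* Hypothetical link state for this sketch only. */
struct fake_link {
	uint32_t status;
	int reports;	/* how many times the link state was reported */
};

static void link_attn(struct fake_link *l, uint32_t new_status)
{
	uint32_t prev = l->status;

	l->status = new_status;

	/* report only if the link status actually changed */
	if (prev != l->status)
		l->reports++;
}
```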





^ permalink raw reply related

* [PATCH net-next-2.6 11/11] bnx2x: Date and version
From: Vladislav Zolotarov @ 2010-04-19 11:15 UTC (permalink / raw)
  To: Dave Miller; +Cc: Eilon Greenstein, netdev list

Set version to 1.52.53-1.

Signed-off-by: Vladislav Zolotarov <vladz@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
---
 drivers/net/bnx2x_main.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/bnx2x_main.c b/drivers/net/bnx2x_main.c
index a0e01bc..42d5af4 100644
--- a/drivers/net/bnx2x_main.c
+++ b/drivers/net/bnx2x_main.c
@@ -61,8 +61,8 @@
 #include "bnx2x_init_ops.h"
 #include "bnx2x_dump.h"
 
-#define DRV_MODULE_VERSION	"1.52.1-8"
-#define DRV_MODULE_RELDATE	"2010/04/01"
+#define DRV_MODULE_VERSION	"1.52.53-1"
+#define DRV_MODULE_RELDATE	"2010/18/04"
 #define BNX2X_BC_VER		0x040200
 
 #include <linux/firmware.h>
-- 
1.7.0.4





^ permalink raw reply related

* [PATCH v3 0/3] e1000e,igb,ixgbe: add registers etc. printout code just before resetting adapters
From: Taku Izumi @ 2010-04-19 11:20 UTC (permalink / raw)
  To: Bruce Allan, David S. Miller, Jesse Brandeburg, John Ronciak,
	"Kirsher, Jeffre
  Cc: Kenji Kaneshige, chavey

Hi Jeff,

This patchset is an updated version of the "register etc. printout code" patch.
The previous versions are
 http://kerneltrap.org/mailarchive/linux-netdev/2010/1/7/6265865
 http://kerneltrap.org/mailarchive/linux-netdev/2010/1/22/6267176

v2 -> v3:
 - avoid modifying headers
 - delete "dump_flag" module parameter and use msglvl instead.

 To enable the printout, msglvl must be set, for example:
 # ethtool -s eth0 msglvl 0x2003

Best regards,
Taku Izumi
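
For reference, 0x2003 is the OR of the drv (0x0001), probe (0x0002) and hw (0x2000) message-level bits. The sketch below shows which parts of the dump a given msglvl would select, going by the checks in the e1000e patch of this series. It is a hedged illustration: the SEC_* flags and dump_sections() are hypothetical, and the sketch ignores the additional requirement that the interface be running before ring information is printed.

```c
#include <stdint.h>

/* Values match the kernel's NETIF_MSG_* bits. */
#define MSG_DRV		0x0001
#define MSG_PROBE	0x0002
#define MSG_TX_DONE	0x0400
#define MSG_RX_STATUS	0x0800
#define MSG_PKTDATA	0x1000
#define MSG_HW		0x2000

/* Hypothetical section flags for this sketch only. */
enum { SEC_REGS = 1, SEC_TX_RING = 2, SEC_RX_RING = 4, SEC_PKTDATA = 8 };

static unsigned int dump_sections(uint32_t msglvl)
{
	unsigned int sec = 0;

	if (!(msglvl & MSG_HW))
		return 0;	/* the whole dump is gated on the hw bit */
	sec |= SEC_REGS;
	if (msglvl & MSG_TX_DONE)
		sec |= SEC_TX_RING;
	if (msglvl & MSG_RX_STATUS)
		sec |= SEC_RX_RING;
	/* packet data is hex-dumped only from inside a ring dump */
	if ((msglvl & MSG_PKTDATA) && (sec & (SEC_TX_RING | SEC_RX_RING)))
		sec |= SEC_PKTDATA;
	return sec;
}
```

Under these assumptions 0x2003 selects the register dump only, while 0x2c03 would add the per-descriptor TX and RX ring dumps.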


^ permalink raw reply

* [PATCH v3 1/3] e1000e: add registers etc. printout code just before resetting adapters
From: Taku Izumi @ 2010-04-19 11:25 UTC (permalink / raw)
  To: Bruce Allan, David S. Miller, Jesse Brandeburg, John Ronciak,
	"Kirsher, Jeffre
  Cc: Kenji Kaneshige, chavey
In-Reply-To: <4BCC3C9B.3000901@jp.fujitsu.com>

This patch adds code that prints the registers (along with the tx/rx ring
status and so on) just before an adapter is reset. This will be helpful for
finding the root cause of adapter resets.

Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
Signed-off-by: Koki Sanagi <sanagi.koki@jp.fujitsu.com>
---
 drivers/net/e1000e/netdev.c |  357 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 357 insertions(+)
Index: net-next-2.6.34/drivers/net/e1000e/netdev.c
===================================================================
--- net-next-2.6.34.orig/drivers/net/e1000e/netdev.c
+++ net-next-2.6.34/drivers/net/e1000e/netdev.c
@@ -69,6 +69,361 @@ static const struct e1000_info *e1000_in
 	[board_pchlan]		= &e1000_pch_info,
 };

+struct e1000_reg_info {
+	u32 ofs;
+	char *name;
+};
+
+#define E1000_RDFH	0x02410 /* Rx Data FIFO Head - RW */
+#define E1000_RDFT	0x02418 /* Rx Data FIFO Tail - RW */
+#define E1000_RDFHS	0x02420 /* Rx Data FIFO Head Saved - RW */
+#define E1000_RDFTS	0x02428 /* Rx Data FIFO Tail Saved - RW */
+#define E1000_RDFPC	0x02430 /* Rx Data FIFO Packet Count - RW */
+
+#define E1000_TDFH	0x03410 /* Tx Data FIFO Head - RW */
+#define E1000_TDFT	0x03418 /* Tx Data FIFO Tail - RW */
+#define E1000_TDFHS	0x03420 /* Tx Data FIFO Head Saved - RW */
+#define E1000_TDFTS	0x03428 /* Tx Data FIFO Tail Saved - RW */
+#define E1000_TDFPC	0x03430 /* Tx Data FIFO Packet Count - RW */
+
+static const struct e1000_reg_info e1000_reg_info_tbl[] = {
+
+	/* General Registers */
+	{E1000_CTRL, "CTRL"},
+	{E1000_STATUS, "STATUS"},
+	{E1000_CTRL_EXT, "CTRL_EXT"},
+
+	/* Interrupt Registers */
+	{E1000_ICR, "ICR"},
+
+	/* RX Registers */
+	{E1000_RCTL, "RCTL"},
+	{E1000_RDLEN, "RDLEN"},
+	{E1000_RDH, "RDH"},
+	{E1000_RDT, "RDT"},
+	{E1000_RDTR, "RDTR"},
+	{E1000_RXDCTL(0), "RXDCTL"},
+	{E1000_ERT, "ERT"},
+	{E1000_RDBAL, "RDBAL"},
+	{E1000_RDBAH, "RDBAH"},
+	{E1000_RDFH, "RDFH"},
+	{E1000_RDFT, "RDFT"},
+	{E1000_RDFHS, "RDFHS"},
+	{E1000_RDFTS, "RDFTS"},
+	{E1000_RDFPC, "RDFPC"},
+
+	/* TX Registers */
+	{E1000_TCTL, "TCTL"},
+	{E1000_TDBAL, "TDBAL"},
+	{E1000_TDBAH, "TDBAH"},
+	{E1000_TDLEN, "TDLEN"},
+	{E1000_TDH, "TDH"},
+	{E1000_TDT, "TDT"},
+	{E1000_TIDV, "TIDV"},
+	{E1000_TXDCTL(0), "TXDCTL"},
+	{E1000_TADV, "TADV"},
+	{E1000_TARC(0), "TARC"},
+	{E1000_TDFH, "TDFH"},
+	{E1000_TDFT, "TDFT"},
+	{E1000_TDFHS, "TDFHS"},
+	{E1000_TDFTS, "TDFTS"},
+	{E1000_TDFPC, "TDFPC"},
+
+	/* List Terminator */
+	{}
+};
+
+/*
+ * e1000_regdump - register printout routine
+ */
+static void e1000_regdump(struct e1000_hw *hw, struct e1000_reg_info *reginfo)
+{
+	int n = 0;
+	char rname[16];
+	u32 regs[8];
+
+	switch (reginfo->ofs) {
+	case E1000_RXDCTL(0):
+		for (n = 0; n < 2; n++)
+			regs[n] = __er32(hw, E1000_RXDCTL(n));
+		break;
+	case E1000_TXDCTL(0):
+		for (n = 0; n < 2; n++)
+			regs[n] = __er32(hw, E1000_TXDCTL(n));
+		break;
+	case E1000_TARC(0):
+		for (n = 0; n < 2; n++)
+			regs[n] = __er32(hw, E1000_TARC(n));
+		break;
+	default:
+		printk(KERN_INFO "%-15s %08x\n",
+			reginfo->name, __er32(hw, reginfo->ofs));
+		return;
+	}
+
+	snprintf(rname, 16, "%s%s", reginfo->name, "[0-1]");
+	printk(KERN_INFO "%-15s ", rname);
+	for (n = 0; n < 2; n++)
+		printk(KERN_CONT "%08x ", regs[n]);
+	printk(KERN_CONT "\n");
+}
+
+
+/*
+ * e1000e_dump - Print registers, tx-ring and rx-ring
+ */
+static void e1000e_dump(struct e1000_adapter *adapter)
+{
+	struct net_device *netdev = adapter->netdev;
+	struct e1000_hw *hw = &adapter->hw;
+	struct e1000_reg_info *reginfo;
+	struct e1000_ring *tx_ring = adapter->tx_ring;
+	struct e1000_tx_desc *tx_desc;
+	struct my_u0 { u64 a; u64 b; } *u0;
+	struct e1000_buffer *buffer_info;
+	struct e1000_ring *rx_ring = adapter->rx_ring;
+	union e1000_rx_desc_packet_split *rx_desc_ps;
+	struct e1000_rx_desc *rx_desc;
+	struct my_u1 { u64 a; u64 b; u64 c; u64 d; } *u1;
+	u32 staterr;
+	int i = 0;
+
+	if (!netif_msg_hw(adapter))
+		return;
+
+	/* Print netdevice Info */
+	if (netdev) {
+		dev_info(&adapter->pdev->dev, "Net device Info\n");
+		printk(KERN_INFO "Device Name     state            "
+			"trans_start      last_rx\n");
+		printk(KERN_INFO "%-15s %016lX %016lX %016lX\n",
+			netdev->name,
+			netdev->state,
+			netdev->trans_start,
+			netdev->last_rx);
+	}
+
+	/* Print Registers */
+	dev_info(&adapter->pdev->dev, "Register Dump\n");
+	printk(KERN_INFO " Register Name   Value\n");
+	for (reginfo = (struct e1000_reg_info *)e1000_reg_info_tbl;
+	     reginfo->name; reginfo++) {
+		e1000_regdump(hw, reginfo);
+	}
+
+	/* Print TX Ring Summary */
+	if (!netdev || !netif_running(netdev))
+		goto exit;
+
+	dev_info(&adapter->pdev->dev, "TX Rings Summary\n");
+	printk(KERN_INFO "Queue [NTU] [NTC] [bi(ntc)->dma  ]"
+		" leng ntw timestamp\n");
+	buffer_info = &tx_ring->buffer_info[tx_ring->next_to_clean];
+	printk(KERN_INFO " %5d %5X %5X %016llX %04X %3X %016llX\n",
+		0, tx_ring->next_to_use, tx_ring->next_to_clean,
+		(u64)buffer_info->dma,
+		buffer_info->length,
+		buffer_info->next_to_watch,
+		(u64)buffer_info->time_stamp);
+
+	/* Print TX Rings */
+	if (!netif_msg_tx_done(adapter))
+		goto rx_ring_summary;
+
+	dev_info(&adapter->pdev->dev, "TX Rings Dump\n");
+
+	/* Transmit Descriptor Formats - DEXT[29] is 0 (Legacy) or 1 (Extended)
+	 *
+	 * Legacy Transmit Descriptor
+	 *   +--------------------------------------------------------------+
+	 * 0 |         Buffer Address [63:0] (Reserved on Write Back)       |
+	 *   +--------------------------------------------------------------+
+	 * 8 | Special  |    CSS     | Status |  CMD    |  CSO   |  Length  |
+	 *   +--------------------------------------------------------------+
+	 *   63       48 47        36 35    32 31     24 23    16 15        0
+	 *
+	 * Extended Context Descriptor (DTYP=0x0) for TSO or checksum offload
+	 *   63      48 47    40 39       32 31             16 15    8 7      0
+	 *   +----------------------------------------------------------------+
+	 * 0 |  TUCSE  | TUCS0  |   TUCSS   |     IPCSE       | IPCS0 | IPCSS |
+	 *   +----------------------------------------------------------------+
+	 * 8 |   MSS   | HDRLEN | RSV | STA | TUCMD | DTYP |      PAYLEN      |
+	 *   +----------------------------------------------------------------+
+	 *   63      48 47    40 39 36 35 32 31   24 23  20 19                0
+	 *
+	 * Extended Data Descriptor (DTYP=0x1)
+	 *   +----------------------------------------------------------------+
+	 * 0 |                     Buffer Address [63:0]                      |
+	 *   +----------------------------------------------------------------+
+	 * 8 | VLAN tag |  POPTS  | Rsvd | Status | Command | DTYP |  DTALEN  |
+	 *   +----------------------------------------------------------------+
+	 *   63       48 47     40 39  36 35    32 31     24 23  20 19        0
+	 */
+	printk(KERN_INFO "Tl[desc]     [address 63:0  ] [SpeCssSCmCsLen]"
+		" [bi->dma       ] leng  ntw timestamp        bi->skb "
+		"<-- Legacy format\n");
+	printk(KERN_INFO "Tc[desc]     [Ce CoCsIpceCoS] [MssHlRSCm0Plen]"
+		" [bi->dma       ] leng  ntw timestamp        bi->skb "
+		"<-- Ext Context format\n");
+	printk(KERN_INFO "Td[desc]     [address 63:0  ] [VlaPoRSCm1Dlen]"
+		" [bi->dma       ] leng  ntw timestamp        bi->skb "
+		"<-- Ext Data format\n");
+	for (i = 0; tx_ring->desc && (i < tx_ring->count); i++) {
+		tx_desc = E1000_TX_DESC(*tx_ring, i);
+		buffer_info = &tx_ring->buffer_info[i];
+		u0 = (struct my_u0 *)tx_desc;
+		printk(KERN_INFO "T%c[0x%03X]    %016llX %016llX %016llX "
+			"%04X  %3X %016llX %p",
+		       (!(le64_to_cpu(u0->b) & (1<<29)) ? 'l' :
+			((le64_to_cpu(u0->b) & (1<<20)) ? 'd' : 'c')), i,
+		       le64_to_cpu(u0->a), le64_to_cpu(u0->b),
+		       (u64)buffer_info->dma, buffer_info->length,
+		       buffer_info->next_to_watch, (u64)buffer_info->time_stamp,
+		       buffer_info->skb);
+		if (i == tx_ring->next_to_use && i == tx_ring->next_to_clean)
+			printk(KERN_CONT " NTC/U\n");
+		else if (i == tx_ring->next_to_use)
+			printk(KERN_CONT " NTU\n");
+		else if (i == tx_ring->next_to_clean)
+			printk(KERN_CONT " NTC\n");
+		else
+			printk(KERN_CONT "\n");
+
+		if (netif_msg_pktdata(adapter) && buffer_info->dma != 0)
+			print_hex_dump(KERN_INFO, "", DUMP_PREFIX_ADDRESS,
+					16, 1, phys_to_virt(buffer_info->dma),
+					buffer_info->length, true);
+	}
+
+	/* Print RX Rings Summary */
+rx_ring_summary:
+	dev_info(&adapter->pdev->dev, "RX Rings Summary\n");
+	printk(KERN_INFO "Queue [NTU] [NTC]\n");
+	printk(KERN_INFO " %5d %5X %5X\n", 0,
+		rx_ring->next_to_use, rx_ring->next_to_clean);
+
+	/* Print RX Rings */
+	if (!netif_msg_rx_status(adapter))
+		goto exit;
+
+	dev_info(&adapter->pdev->dev, "RX Rings Dump\n");
+	switch (adapter->rx_ps_pages) {
+	case 1:
+	case 2:
+	case 3:
+		/* [Extended] Packet Split Receive Descriptor Format
+		 *
+		 *    +-----------------------------------------------------+
+		 *  0 |                Buffer Address 0 [63:0]              |
+		 *    +-----------------------------------------------------+
+		 *  8 |                Buffer Address 1 [63:0]              |
+		 *    +-----------------------------------------------------+
+		 * 16 |                Buffer Address 2 [63:0]              |
+		 *    +-----------------------------------------------------+
+		 * 24 |                Buffer Address 3 [63:0]              |
+		 *    +-----------------------------------------------------+
+		 */
+		printk(KERN_INFO "R  [desc]      [buffer 0 63:0 ] "
+			"[buffer 1 63:0 ] "
+		       "[buffer 2 63:0 ] [buffer 3 63:0 ] [bi->dma       ] "
+		       "[bi->skb] <-- Ext Pkt Split format\n");
+		/* [Extended] Receive Descriptor (Write-Back) Format
+		 *
+		 *   63       48 47    32 31     13 12    8 7    4 3        0
+		 *   +------------------------------------------------------+
+		 * 0 | Packet   | IP     |  Rsvd   | MRQ   | Rsvd | MRQ RSS |
+		 *   | Checksum | Ident  |         | Queue |      |  Type   |
+		 *   +------------------------------------------------------+
+		 * 8 | VLAN Tag | Length | Extended Error | Extended Status |
+		 *   +------------------------------------------------------+
+		 *   63       48 47    32 31            20 19               0
+		 */
+		printk(KERN_INFO "RWB[desc]      [ck ipid mrqhsh] "
+			"[vl   l0 ee  es] "
+		       "[ l3  l2  l1 hs] [reserved      ] ---------------- "
+		       "[bi->skb] <-- Ext Rx Write-Back format\n");
+		for (i = 0; i < rx_ring->count; i++) {
+			buffer_info = &rx_ring->buffer_info[i];
+			rx_desc_ps = E1000_RX_DESC_PS(*rx_ring, i);
+			u1 = (struct my_u1 *)rx_desc_ps;
+			staterr =
+				le32_to_cpu(rx_desc_ps->wb.middle.status_error);
+			if (staterr & E1000_RXD_STAT_DD) {
+				/* Descriptor Done */
+				printk(KERN_INFO "RWB[0x%03X]     %016llX "
+					"%016llX %016llX %016llX "
+					"---------------- %p", i,
+					le64_to_cpu(u1->a),
+					le64_to_cpu(u1->b),
+					le64_to_cpu(u1->c),
+					le64_to_cpu(u1->d),
+					buffer_info->skb);
+			} else {
+				printk(KERN_INFO "R  [0x%03X]     %016llX "
+					"%016llX %016llX %016llX %016llX %p", i,
+					le64_to_cpu(u1->a),
+					le64_to_cpu(u1->b),
+					le64_to_cpu(u1->c),
+					le64_to_cpu(u1->d),
+					(u64)buffer_info->dma,
+					buffer_info->skb);
+
+				if (netif_msg_pktdata(adapter))
+					print_hex_dump(KERN_INFO, "",
+						DUMP_PREFIX_ADDRESS, 16, 1,
+						phys_to_virt(buffer_info->dma),
+						adapter->rx_ps_bsize0, true);
+			}
+
+			if (i == rx_ring->next_to_use)
+				printk(KERN_CONT " NTU\n");
+			else if (i == rx_ring->next_to_clean)
+				printk(KERN_CONT " NTC\n");
+			else
+				printk(KERN_CONT "\n");
+		}
+		break;
+	default:
+	case 0:
+		/* Legacy Receive Descriptor Format
+		 *
+		 * +-----------------------------------------------------+
+		 * |                Buffer Address [63:0]                |
+		 * +-----------------------------------------------------+
+		 * | VLAN Tag | Errors | Status 0 | Packet csum | Length |
+		 * +-----------------------------------------------------+
+		 * 63       48 47    40 39      32 31         16 15      0
+		 */
+		printk(KERN_INFO "Rl[desc]     [address 63:0  ] "
+			"[vl er S cks ln] [bi->dma       ] [bi->skb] "
+			"<-- Legacy format\n");
+		for (i = 0; rx_ring->desc && (i < rx_ring->count); i++) {
+			rx_desc = E1000_RX_DESC(*rx_ring, i);
+			buffer_info = &rx_ring->buffer_info[i];
+			u0 = (struct my_u0 *)rx_desc;
+			printk(KERN_INFO "Rl[0x%03X]    %016llX %016llX "
+				"%016llX %p",
+				i, le64_to_cpu(u0->a), le64_to_cpu(u0->b),
+				(u64)buffer_info->dma, buffer_info->skb);
+			if (i == rx_ring->next_to_use)
+				printk(KERN_CONT " NTU\n");
+			else if (i == rx_ring->next_to_clean)
+				printk(KERN_CONT " NTC\n");
+			else
+				printk(KERN_CONT "\n");
+
+			if (netif_msg_pktdata(adapter))
+				print_hex_dump(KERN_INFO, "",
+					DUMP_PREFIX_ADDRESS,
+					16, 1, phys_to_virt(buffer_info->dma),
+					adapter->rx_buffer_len, true);
+		}
+	}
+
+exit:
+	return;
+}
+
 /**
  * e1000_desc_unused - calculate if we have unused descriptors
  **/
@@ -4268,6 +4623,8 @@ static void e1000_reset_task(struct work
 	struct e1000_adapter *adapter;
 	adapter = container_of(work, struct e1000_adapter, reset_task);

+	e1000e_dump(adapter);
+	e_err("Reset adapter\n");
 	e1000e_reinit_locked(adapter);
 }




^ permalink raw reply
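
The 'l', 'c' and 'd' tags printed by the TX ring dump in the patch above come from two bits in the second quadword of each descriptor: bit 29 (DEXT) and bit 20 (the low bit of DTYP). A minimal stand-alone sketch of that decision follows; the function name tx_desc_kind() is hypothetical, and the input is assumed to be already converted to CPU byte order.

```c
#include <stdint.h>

/*
 * Classify a TX descriptor from its second quadword:
 *   'l' legacy            (DEXT, bit 29, clear)
 *   'c' extended context  (DEXT set, bit 20 clear)
 *   'd' extended data     (DEXT set, bit 20 set)
 */
static char tx_desc_kind(uint64_t qword1)
{
	if (!(qword1 & (1ULL << 29)))
		return 'l';
	return (qword1 & (1ULL << 20)) ? 'd' : 'c';
}
```

Note that bit 20 is meaningful only when DEXT is set, so a legacy descriptor is reported as 'l' regardless of that bit.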

