Netdev List
* Re: [PATCH] vfs: dont chain pipe/anon/socket on superblock s_inodes list
From: Eric Dumazet @ 2011-07-26  9:36 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Tim Chen, Al Viro, David Miller, Andi Kleen, Matthew Wilcox,
	Anton Blanchard, npiggin, linux-kernel, linux-fsdevel, netdev
In-Reply-To: <20110726090357.GA13013@infradead.org>

On Tuesday, 26 July 2011 at 05:03 -0400, Christoph Hellwig wrote:
> On Tue, Jul 26, 2011 at 10:21:06AM +0200, Eric Dumazet wrote:
> > Well, not 'last' contention point, as we still hit remove_inode_hash(),
> 
> There should be no need to put pipe or anon inodes on the inode hash.
> Probably sockets don't need it either, but I'd need to look at it in
> detail.
> 
> > inode_wb_list_del()
> 
> They should never be on the wb list either; doing an unlocked check for
> actually being on the list before taking the lock should help you.

Yes, it might even help regular inodes ;)

> 
> > inode_lru_list_del(),
> 
> No real need to keep inodes in the LRU if we only allocate them using
> new_inode but never look them up either.  You might want to try setting
> .drop_inode to generic_delete_inode for these.

Yes, I'll take a look, thanks.

> 
> > +struct inode *__new_inode(struct super_block *sb)
> > +{
> > +	struct inode *inode = alloc_inode(sb);
> > +
> > +	if (inode) {
> > +		spin_lock(&inode->i_lock);
> > +		inode->i_state = 0;
> > +		spin_unlock(&inode->i_lock);
> > +		INIT_LIST_HEAD(&inode->i_sb_list);
> > +	}
> > +	return inode;
> > +}
> 
> This needs a much better name like new_inode_pseudo, and a kerneldoc 
> comment explaining when it is safe to use, and the consequences, which
> appear to me:
> 
>  - fs may never be unmounted
>  - quotas can't work on the filesystem
>  - writeback can't work on the filesystem

Thanks for reviewing, here is v2 of the patch, addressing your comments.


[PATCH v2] vfs: dont chain pipe/anon/socket on superblock s_inodes list

Workloads using pipes and sockets hit inode_sb_list_lock contention.

The superblock s_inodes list is needed for quota, dirty, pagecache and
fsnotify management. The pipe/anon/socket filesystems are clearly not
candidates for these.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
v2: address Christoph comments

 fs/anon_inodes.c   |    2 +-
 fs/inode.c         |   39 ++++++++++++++++++++++++++++++---------
 fs/pipe.c          |    2 +-
 include/linux/fs.h |    3 ++-
 net/socket.c       |    2 +-
 5 files changed, 35 insertions(+), 13 deletions(-)

diff --git a/fs/anon_inodes.c b/fs/anon_inodes.c
index 4d433d3..f11e43e 100644
--- a/fs/anon_inodes.c
+++ b/fs/anon_inodes.c
@@ -187,7 +187,7 @@ EXPORT_SYMBOL_GPL(anon_inode_getfd);
  */
 static struct inode *anon_inode_mkinode(void)
 {
-	struct inode *inode = new_inode(anon_inode_mnt->mnt_sb);
+	struct inode *inode = new_inode_pseudo(anon_inode_mnt->mnt_sb);
 
 	if (!inode)
 		return ERR_PTR(-ENOMEM);
diff --git a/fs/inode.c b/fs/inode.c
index 96c77b8..319b93b 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -362,9 +362,11 @@ EXPORT_SYMBOL_GPL(inode_sb_list_add);
 
 static inline void inode_sb_list_del(struct inode *inode)
 {
-	spin_lock(&inode_sb_list_lock);
-	list_del_init(&inode->i_sb_list);
-	spin_unlock(&inode_sb_list_lock);
+	if (!list_empty(&inode->i_sb_list)) {
+		spin_lock(&inode_sb_list_lock);
+		list_del_init(&inode->i_sb_list);
+		spin_unlock(&inode_sb_list_lock);
+	}
 }
 
 static unsigned long hash(struct super_block *sb, unsigned long hashval)
@@ -797,6 +799,29 @@ unsigned int get_next_ino(void)
 EXPORT_SYMBOL(get_next_ino);
 
 /**
+ *	new_inode_pseudo 	- obtain an inode
+ *	@sb: superblock
+ *
+ *	Allocates a new inode for given superblock.
+ *	Inode won't be chained in superblock s_inodes list.
+ *	This means:
+ *	- fs can't be unmounted
+ *	- quotas, fsnotify, writeback can't work
+ */
+struct inode *new_inode_pseudo(struct super_block *sb)
+{
+	struct inode *inode = alloc_inode(sb);
+
+	if (inode) {
+		spin_lock(&inode->i_lock);
+		inode->i_state = 0;
+		spin_unlock(&inode->i_lock);
+		INIT_LIST_HEAD(&inode->i_sb_list);
+	}
+	return inode;
+}
+
+/**
  *	new_inode 	- obtain an inode
  *	@sb: superblock
  *
@@ -814,13 +839,9 @@ struct inode *new_inode(struct super_block *sb)
 
 	spin_lock_prefetch(&inode_sb_list_lock);
 
-	inode = alloc_inode(sb);
-	if (inode) {
-		spin_lock(&inode->i_lock);
-		inode->i_state = 0;
-		spin_unlock(&inode->i_lock);
+	inode = new_inode_pseudo(sb);
+	if (inode)
 		inode_sb_list_add(inode);
-	}
 	return inode;
 }
 EXPORT_SYMBOL(new_inode);
diff --git a/fs/pipe.c b/fs/pipe.c
index 1b7f9af..0e0be1d 100644
--- a/fs/pipe.c
+++ b/fs/pipe.c
@@ -948,7 +948,7 @@ static const struct dentry_operations pipefs_dentry_operations = {
 
 static struct inode * get_pipe_inode(void)
 {
-	struct inode *inode = new_inode(pipe_mnt->mnt_sb);
+	struct inode *inode = new_inode_pseudo(pipe_mnt->mnt_sb);
 	struct pipe_inode_info *pipe;
 
 	if (!inode)
diff --git a/include/linux/fs.h b/include/linux/fs.h
index a665804..cc363fa 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2310,7 +2310,8 @@ extern void __iget(struct inode * inode);
 extern void iget_failed(struct inode *);
 extern void end_writeback(struct inode *);
 extern void __destroy_inode(struct inode *);
-extern struct inode *new_inode(struct super_block *);
+extern struct inode *new_inode_pseudo(struct super_block *sb);
+extern struct inode *new_inode(struct super_block *sb);
 extern void free_inode_nonrcu(struct inode *inode);
 extern int should_remove_suid(struct dentry *);
 extern int file_remove_suid(struct file *);
diff --git a/net/socket.c b/net/socket.c
index 02dc82d..26ed35c 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -467,7 +467,7 @@ static struct socket *sock_alloc(void)
 	struct inode *inode;
 	struct socket *sock;
 
-	inode = new_inode(sock_mnt->mnt_sb);
+	inode = new_inode_pseudo(sock_mnt->mnt_sb);
 	if (!inode)
 		return NULL;
 



^ permalink raw reply related
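
The effect of the v2 patch above on the delete path can be illustrated with a small user-space sketch. It is not kernel code: the list helpers, the `pseudo_inode` type, the spinlock built on `atomic_flag` and the `lock_acquisitions` counter are all hypothetical stand-ins for `struct list_head`, `struct inode`, `inode_sb_list_lock` and friends. The sketch only shows why a self-linked `i_sb_list` lets the delete path skip the global lock; it assumes, as in the eviction path, that nobody adds the node to the list concurrently with the unlocked check.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Minimal stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

static void init_list_head(struct list_head *h) { h->next = h; h->prev = h; }
static bool list_is_empty(const struct list_head *h) { return h->next == h; }

static void list_add_node(struct list_head *n, struct list_head *head)
{
	n->next = head->next;
	n->prev = head;
	head->next->prev = n;
	head->next = n;
}

static void list_del_init_node(struct list_head *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	init_list_head(n);
}

/* Hypothetical global list and lock, playing the role of s_inodes
 * and inode_sb_list_lock. */
static struct list_head sb_list = { &sb_list, &sb_list };
static atomic_flag sb_list_lock = ATOMIC_FLAG_INIT;
static unsigned long lock_acquisitions;	/* how often the lock was taken */

static void sb_lock(void)
{
	while (atomic_flag_test_and_set_explicit(&sb_list_lock,
						 memory_order_acquire))
		;	/* spin until the flag is released */
	lock_acquisitions++;
}

static void sb_unlock(void)
{
	atomic_flag_clear_explicit(&sb_list_lock, memory_order_release);
}

struct pseudo_inode {
	struct list_head sb_link;	/* plays the role of i_sb_list */
};

/* Like new_inode_pseudo(): self-linked, never chained on the list. */
static void pseudo_inode_init(struct pseudo_inode *ino)
{
	init_list_head(&ino->sb_link);
}

/* Like inode_sb_list_add(): chain the node under the global lock. */
static void sb_list_add(struct pseudo_inode *ino)
{
	sb_lock();
	list_add_node(&ino->sb_link, &sb_list);
	sb_unlock();
}

/* Like the patched inode_sb_list_del(): unchained nodes skip the lock. */
static void sb_list_del(struct pseudo_inode *ino)
{
	if (!list_is_empty(&ino->sb_link)) {
		sb_lock();
		list_del_init_node(&ino->sb_link);
		sb_unlock();
	}
}
```

In this sketch, deleting a node that was only initialised with `pseudo_inode_init()` never touches `sb_list_lock`, while a node added with `sb_list_add()` takes the lock once on add and once on delete.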

* Re: [PATCH] vfs: dont chain pipe/anon/socket on superblock s_inodes list
From: Christoph Hellwig @ 2011-07-26  9:03 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Tim Chen, Al Viro, David Miller, Christoph Hellwig, Andi Kleen,
	Matthew Wilcox, Anton Blanchard, npiggin, linux-kernel,
	linux-fsdevel, netdev
In-Reply-To: <1311668466.2355.12.camel@edumazet-HP-Compaq-6005-Pro-SFF-PC>

On Tue, Jul 26, 2011 at 10:21:06AM +0200, Eric Dumazet wrote:
> Well, not 'last' contention point, as we still hit remove_inode_hash(),

There should be no need to put pipe or anon inodes on the inode hash.
Probably sockets don't need it either, but I'd need to look at it in
detail.

> inode_wb_list_del()

They should never be on the wb list either; doing an unlocked check for
actually being on the list before taking the lock should help you.

> inode_lru_list_del(),

No real need to keep inodes in the LRU if we only allocate them using
new_inode but never look them up either.  You might want to try setting
.drop_inode to generic_delete_inode for these.

> +struct inode *__new_inode(struct super_block *sb)
> +{
> +	struct inode *inode = alloc_inode(sb);
> +
> +	if (inode) {
> +		spin_lock(&inode->i_lock);
> +		inode->i_state = 0;
> +		spin_unlock(&inode->i_lock);
> +		INIT_LIST_HEAD(&inode->i_sb_list);
> +	}
> +	return inode;
> +}

This needs a much better name like new_inode_pseudo, and a kerneldoc 
comment explaining when it is safe to use, and the consequences, which
appear to me:

 - fs may never be unmounted
 - quotas can't work on the filesystem
 - writeback can't work on the filesystem

> @@ -814,13 +829,9 @@ struct inode *new_inode(struct super_block *sb)
>  
>  	spin_lock_prefetch(&inode_sb_list_lock);
>  
> -	inode = alloc_inode(sb);
> -	if (inode) {
> -		spin_lock(&inode->i_lock);
> -		inode->i_state = 0;
> -		spin_unlock(&inode->i_lock);
> -		inode_sb_list_add(inode);
> -	}
> +	inode = __new_inode(sb);
> +	if (inode)
> +			inode_sb_list_add(inode);

bad indentation.


^ permalink raw reply

* [PATCH] vfs: dont chain pipe/anon/socket on superblock s_inodes list
From: Eric Dumazet @ 2011-07-26  8:21 UTC (permalink / raw)
  To: Tim Chen, Al Viro, David Miller
  Cc: Christoph Hellwig, Andi Kleen, Matthew Wilcox, Anton Blanchard,
	npiggin, linux-kernel, linux-fsdevel, netdev
In-Reply-To: <1311660013.2996.6.camel@edumazet-laptop>

On Tuesday, 26 July 2011 at 08:00 +0200, Eric Dumazet wrote:

> Next step is to not chain pipes/sockets into superblock s_inodes list
> 
> inode_sb_list_add()/inode_sb_list_del() is the very last contention
> point because of spin_lock(&inode_sb_list_lock);

Well, not the 'last' contention point, as we still hit remove_inode_hash(),
inode_wb_list_del(), inode_lru_list_del(), but that's a clear win on my
2x4x2 machine: 9 seconds instead of 22 on a close(socket()) benchmark.
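
The benchmark program itself is not included in this message. A minimal single-threaded sketch of a close(socket()) loop could look like the following; the function names, the AF_UNIX/SOCK_DGRAM socket type and the timing helper are assumptions made for illustration, and the contention reported above would only show up when several such loops run concurrently on different CPUs.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/socket.h>
#include <time.h>
#include <unistd.h>

/*
 * Create and immediately close `iterations` sockets.
 * Returns the number of successful socket()/close() pairs.
 * The socket family and type are arbitrary choices for this sketch.
 */
static long close_socket_loop(long iterations)
{
	long done = 0;

	for (long i = 0; i < iterations; i++) {
		int fd = socket(AF_UNIX, SOCK_DGRAM, 0);

		if (fd < 0)
			break;		/* give up on the first failure */
		if (close(fd) == 0)
			done++;
	}
	return done;
}

/* Elapsed time in seconds for one run, or a negative value on failure. */
static double time_close_socket_loop(long iterations)
{
	struct timespec t0, t1;

	if (clock_gettime(CLOCK_MONOTONIC, &t0) != 0)
		return -1.0;
	if (close_socket_loop(iterations) != iterations)
		return -1.0;
	if (clock_gettime(CLOCK_MONOTONIC, &t1) != 0)
		return -1.0;
	return (double)(t1.tv_sec - t0.tv_sec) +
	       (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
}
```

Each iteration allocates and frees one socket inode, so the loop exercises the new_inode()/evict path that the patch below changes.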


[PATCH] vfs: dont chain pipe/anon/socket on superblock s_inodes list

Workloads using pipes and sockets hit inode_sb_list_lock contention.

The superblock s_inodes list is needed for quota, dirty, pagecache and
fsnotify management. The pipe/anon/socket filesystems are clearly not
candidates for these.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
 fs/anon_inodes.c   |    2 +-
 fs/inode.c         |   31 +++++++++++++++++++++----------
 fs/pipe.c          |    2 +-
 include/linux/fs.h |    3 ++-
 net/socket.c       |    2 +-
 5 files changed, 26 insertions(+), 14 deletions(-)

diff --git a/fs/anon_inodes.c b/fs/anon_inodes.c
index 4d433d3..269499e 100644
--- a/fs/anon_inodes.c
+++ b/fs/anon_inodes.c
@@ -187,7 +187,7 @@ EXPORT_SYMBOL_GPL(anon_inode_getfd);
  */
 static struct inode *anon_inode_mkinode(void)
 {
-	struct inode *inode = new_inode(anon_inode_mnt->mnt_sb);
+	struct inode *inode = __new_inode(anon_inode_mnt->mnt_sb);
 
 	if (!inode)
 		return ERR_PTR(-ENOMEM);
diff --git a/fs/inode.c b/fs/inode.c
index 96c77b8..8a6d62b 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -362,9 +362,11 @@ EXPORT_SYMBOL_GPL(inode_sb_list_add);
 
 static inline void inode_sb_list_del(struct inode *inode)
 {
-	spin_lock(&inode_sb_list_lock);
-	list_del_init(&inode->i_sb_list);
-	spin_unlock(&inode_sb_list_lock);
+	if (!list_empty(&inode->i_sb_list)) {
+		spin_lock(&inode_sb_list_lock);
+		list_del_init(&inode->i_sb_list);
+		spin_unlock(&inode_sb_list_lock);
+	}
 }
 
 static unsigned long hash(struct super_block *sb, unsigned long hashval)
@@ -796,6 +798,19 @@ unsigned int get_next_ino(void)
 }
 EXPORT_SYMBOL(get_next_ino);
 
+struct inode *__new_inode(struct super_block *sb)
+{
+	struct inode *inode = alloc_inode(sb);
+
+	if (inode) {
+		spin_lock(&inode->i_lock);
+		inode->i_state = 0;
+		spin_unlock(&inode->i_lock);
+		INIT_LIST_HEAD(&inode->i_sb_list);
+	}
+	return inode;
+}
+
 /**
  *	new_inode 	- obtain an inode
  *	@sb: superblock
@@ -814,13 +829,9 @@ struct inode *new_inode(struct super_block *sb)
 
 	spin_lock_prefetch(&inode_sb_list_lock);
 
-	inode = alloc_inode(sb);
-	if (inode) {
-		spin_lock(&inode->i_lock);
-		inode->i_state = 0;
-		spin_unlock(&inode->i_lock);
-		inode_sb_list_add(inode);
-	}
+	inode = __new_inode(sb);
+	if (inode)
+			inode_sb_list_add(inode);
 	return inode;
 }
 EXPORT_SYMBOL(new_inode);
diff --git a/fs/pipe.c b/fs/pipe.c
index 1b7f9af..937b962 100644
--- a/fs/pipe.c
+++ b/fs/pipe.c
@@ -948,7 +948,7 @@ static const struct dentry_operations pipefs_dentry_operations = {
 
 static struct inode * get_pipe_inode(void)
 {
-	struct inode *inode = new_inode(pipe_mnt->mnt_sb);
+	struct inode *inode = __new_inode(pipe_mnt->mnt_sb);
 	struct pipe_inode_info *pipe;
 
 	if (!inode)
diff --git a/include/linux/fs.h b/include/linux/fs.h
index a665804..60be54f 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2310,7 +2310,8 @@ extern void __iget(struct inode * inode);
 extern void iget_failed(struct inode *);
 extern void end_writeback(struct inode *);
 extern void __destroy_inode(struct inode *);
-extern struct inode *new_inode(struct super_block *);
+extern struct inode *__new_inode(struct super_block *sb);
+extern struct inode *new_inode(struct super_block *sb);
 extern void free_inode_nonrcu(struct inode *inode);
 extern int should_remove_suid(struct dentry *);
 extern int file_remove_suid(struct file *);
diff --git a/net/socket.c b/net/socket.c
index 02dc82d..b4b8a08 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -467,7 +467,7 @@ static struct socket *sock_alloc(void)
 	struct inode *inode;
 	struct socket *sock;
 
-	inode = new_inode(sock_mnt->mnt_sb);
+	inode = __new_inode(sock_mnt->mnt_sb);
 	if (!inode)
 		return NULL;
 


--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related

* Re: write() udp socket
From: ZHOU Xiaobo @ 2011-07-26  5:39 UTC (permalink / raw)
  To: Huajun Li; +Cc: netdev

------------------
Sincerely yours
                         ZHOU Xiaobo
 
 
 
------------------ Original ------------------
From:  "Huajun Li"<huajun.li.lee@gmail.com>;
Date:  Sun, Jul 24, 2011 04:33 PM
To:  "ZHOU Xiaobo"<xb.zhou@qq.com>;
Cc:  "netdev"<netdev@vger.kernel.org>;
Subject:  Re: write() udp socket
 
2011/7/23 ZHOU Xiaobo <xb.zhou@qq.com>:
> question No1:
> When I call
> ssize_t write(int fd, const void *buf, size_t count);
>
>
> on a nonblocking UDP socket, is the return value  always equal to 'count'?
>
>

I don't think so.  The function may be interrupted by a signal or return
for some other reason, so the return value only represents the number of
bytes it wrote successfully to the fd.



UDP is datagram-based, so I think it guarantees that the 'buffer' passed to 'write()' is sent
in its entirety, as an atomic operation.


> question No2:
> Can I write() a UDP socket in multiple threads without locking?
>

In my opinion, you could. However, the receiver may not get what you expected.

What will happen? My only concern is whether the application 'buffer' could be sent partially,
which would be unacceptable.

>
> thanks
>
>
> ------------------
> Sincerely yours
>                         ZHOU Xiaobo

^ permalink raw reply

* Re: write() udp socket
From: ZHOU Xiaobo @ 2011-07-26  5:32 UTC (permalink / raw)
  To: Rick Jones, Huajun Li; +Cc: netdev

------------------
Sincerely yours
                         ZHOU Xiaobo
 
 
 
------------------ Original ------------------
From:  "Rick Jones"<rick.jones2@hp.com>;
Date:  Tue, Jul 26, 2011 01:38 AM
To:  "Huajun Li"<huajun.li.lee@gmail.com>;
Cc:  "ZHOU Xiaobo"<xb.zhou@qq.com>; "netdev"<netdev@vger.kernel.org>;
Subject:  Re: write() udp socket
 
On 07/24/2011 01:33 AM, Huajun Li wrote:
> 2011/7/23 ZHOU Xiaobo<xb.zhou@qq.com>:
>> question No1:
>> When I call
>> ssize_t write(int fd, const void *buf, size_t count);
>>
>>
>> on a nonblocking UDP socket, is the return value  always equal to 'count'?
>>
>>
>
> I don't think so.  The function may be interrupted by a signal or return
> for some other reason, so the return value only represents the number of
> bytes it wrote successfully to the fd.

I believe it should either appear to succeed or fail.  write() best not
be sending partial UDP datagrams.  That would be "bad."



Yes, that is what I think too. If so, is the answer to question No. 2 also 'yes'?
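
The all-or-nothing behaviour described in this thread can be captured in a small wrapper. This is only a sketch under the assumption stated above, namely that write() on a datagram socket either queues the whole buffer as one datagram or fails; the names `write_datagram` and `datagram_roundtrip_ok` are hypothetical, and the round-trip helper uses an AF_UNIX datagram socket pair as a stand-in for UDP so that it runs without any network setup.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Send one datagram with write().
 * Returns 0 if the whole buffer was accepted as a single datagram,
 * -1 with errno set otherwise (for example EAGAIN on a full
 * nonblocking socket, or EMSGSIZE if the datagram is too large).
 */
static int write_datagram(int fd, const void *buf, size_t len)
{
	ssize_t n;

	do {
		n = write(fd, buf, len);
	} while (n < 0 && errno == EINTR);	/* interrupted: retry */

	if (n < 0)
		return -1;
	if ((size_t)n != len) {		/* not expected for datagrams */
		errno = EIO;
		return -1;
	}
	return 0;
}

/* Send msg over a datagram socket pair and check it arrives intact. */
static int datagram_roundtrip_ok(const char *msg)
{
	int sv[2];
	char in[256];
	size_t len = strlen(msg);
	ssize_t got;
	int ok;

	if (len == 0 || len > sizeof(in))
		return 0;
	if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) != 0)
		return 0;

	ok = write_datagram(sv[0], msg, len) == 0;
	got = ok ? read(sv[1], in, sizeof(in)) : -1;
	ok = ok && got == (ssize_t)len && memcmp(in, msg, len) == 0;

	close(sv[0]);
	close(sv[1]);
	return ok;
}
```

The wrapper treats a short count as an error rather than looping to send the remainder, because resending the tail would produce a second, separate datagram.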

^ permalink raw reply

* Re: IPv6: autoconfiguration and suspend/resume or link down/up
From: Anirban Chakraborty @ 2011-07-26  5:16 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: Jiri Bohac, netdev, Herbert Xu, David Miller, stephen hemminger
In-Reply-To: <79ahktw3gy268nl9yyk76cxe.1311104529189@email.android.com>


On Jul 19, 2011, at 12:42 PM, Stephen Hemminger wrote:

> The bridge forwarding table, route cache, and neighbor table could have the same problem. I thought carrier was supposed to toggle on suspend or hibernate.

In the case of a VM using a VF as its NIC device, the VF would not know of a suspend event unless the hypervisor sends a notification to the PF,
which the PF could then relay to the VF. Does KVM send such a notification at present? The other option would be to bring down the interface in the
VM.

-Anirban



> 
> Jiri Bohac <jbohac@suse.cz> wrote:
> 
>> Hi,
>> 
>> I came across a surprising behaviour with IPv6 autoconfiguration,
>> which I think is a bug, but I would first like to hear other
>> people's opinions before trying to fix this:
>> 
>> Problem 1: all the address/route lifetimes are kept in jiffies
>> and jiffies don't get incremented on resume. So when a
>> route/address lifetime is 30 minutes and the system resumes after
>> 1 hour, the route/address should be considered expired, but it is
>> not.
>> 
>> Problem 2: when a system is moved to a new network a RS is not
>> sent. Thus, IPv6 does not autoconfigure until the router sends a
>> periodic RA. This can occur both while the system is alive and
>> while it is suspended. I think the autoconfigured state should be
>> discarded when the kernel suspects the system could have been
>> moved to a different network.
>> 
>> When the cable is unplugged and plugged in again, we already get
>> notified through linkwatch -> netdev_state_change ->
>> -> call_netdevice_notifiers(NETDEV_CHANGE, ...)
>> However, if the device has already been autoconfigured,
>> addrconf_notify() only handles this event by printing a
>> message.
>> 
>> So my idea was to:
>> - handle link up/down in addrconf_notify() similarly to
>> NETDEV_UP/NETDEV_DOWN
>> 
>> - on suspend, faking a link down event; on resume, faking a link up event
>> (or better, having a special event type for suspend/resume)
>> 
>> This would cause autoconfiguration to be restarted on resume as
>> well as cable plug/unplug, solving both the above problems.
>> 
>> Or do we want to completely rely on userspace tools
>> (networkmanager/ifplug) and expect them to do NETDEV_DOWN on
>> unplug/suspend and NETDEV_UP on plug/resume?
>> 
>> Any thoughts?
>> 
>> -- 
>> Jiri Bohac <jbohac@suse.cz>
>> SUSE Labs, SUSE CZ
>> 
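
Problem 1 in the quoted message can be made concrete with a toy model. The sketch below is plain user-space C, not kernel code: `struct clocks`, `run_for()`, `suspend_for()` and `lifetime_expired()` are hypothetical names, with one counter that stops during suspend (standing in for jiffies as described above) and one that keeps counting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Two hypothetical clocks, both in seconds. */
struct clocks {
	uint64_t ticks;		/* jiffies-like: frozen while suspended */
	uint64_t elapsed;	/* also counts time spent suspended */
};

/* The system runs normally: both clocks advance. */
static void run_for(struct clocks *c, uint64_t secs)
{
	c->ticks += secs;
	c->elapsed += secs;
}

/* The system is suspended: only the suspend-aware clock advances. */
static void suspend_for(struct clocks *c, uint64_t secs)
{
	c->elapsed += secs;
}

/* True once `lifetime` seconds have passed since `stamp` on one clock. */
static bool lifetime_expired(uint64_t now, uint64_t stamp, uint64_t lifetime)
{
	return now - stamp >= lifetime;
}
```

With a 30-minute lifetime stamped at time 0, one minute of run time followed by a one-hour suspend leaves the address valid according to `ticks` but expired according to `elapsed`, which is the mismatch the message describes.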


^ permalink raw reply

* [PATCH net-next-2.6 2/2] be2net: use stats-sync to read/write 64-bit stats
From: Sathya Perla @ 2011-07-26  5:10 UTC (permalink / raw)
  To: netdev
In-Reply-To: <1311657015-23465-1-git-send-email-sathya.perla@emulex.com>

64-bit stats in be2net are written/read as follows using the stats-sync
interface for safe access in 32-bit archs:

64-bit stats  sync                  writer               reader
------------------------------------------------------------------------------
tx_stats      tx_stats->sync        be_xmit              be_get_stats64,
                                                         ethtool
tx-compl      tx_stats->sync_compl  tx-compl-processing  ethtool
rx-stats      rx_stats->sync        rx-compl-processing  be_get_stats64,
                                                         ethtool,
                                                         eqd-update

This patch is based on Stephen Hemminger's earlier patch on the same issue...

Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
---
 drivers/net/benet/be.h         |    5 ++-
 drivers/net/benet/be_cmds.c    |    1 -
 drivers/net/benet/be_ethtool.c |   61 +++++++++++++++++++++----------
 drivers/net/benet/be_main.c    |   77 +++++++++++++++++++++++++---------------
 4 files changed, 93 insertions(+), 51 deletions(-)

diff --git a/drivers/net/benet/be.h b/drivers/net/benet/be.h
index 68227fd..af57b51 100644
--- a/drivers/net/benet/be.h
+++ b/drivers/net/benet/be.h
@@ -29,6 +29,7 @@
 #include <linux/interrupt.h>
 #include <linux/firmware.h>
 #include <linux/slab.h>
+#include <linux/u64_stats_sync.h>
 
 #include "be_hw.h"
 
@@ -174,6 +175,8 @@ struct be_tx_stats {
 	u64 tx_compl;
 	ulong tx_jiffies;
 	u32 tx_stops;
+	struct u64_stats_sync sync;
+	struct u64_stats_sync sync_compl;
 };
 
 struct be_tx_obj {
@@ -206,6 +209,7 @@ struct be_rx_stats {
 	u32 rx_mcast_pkts;
 	u32 rx_compl_err;	/* completions with err set */
 	u32 rx_pps;		/* pkts per second */
+	struct u64_stats_sync sync;
 };
 
 struct be_rx_compl_info {
@@ -518,7 +522,6 @@ static inline bool be_multi_rxq(const struct be_adapter *adapter)
 extern void be_cq_notify(struct be_adapter *adapter, u16 qid, bool arm,
 		u16 num_popped);
 extern void be_link_status_update(struct be_adapter *adapter, bool link_up);
-extern void netdev_stats_update(struct be_adapter *adapter);
 extern void be_parse_stats(struct be_adapter *adapter);
 extern int be_load_fw(struct be_adapter *adapter, u8 *func);
 #endif				/* BE_H */
diff --git a/drivers/net/benet/be_cmds.c b/drivers/net/benet/be_cmds.c
index e15f06a..7dc4741 100644
--- a/drivers/net/benet/be_cmds.c
+++ b/drivers/net/benet/be_cmds.c
@@ -83,7 +83,6 @@ static int be_mcc_compl_process(struct be_adapter *adapter,
 			 (compl->tag0 == OPCODE_ETH_GET_PPORT_STATS)) &&
 			(compl->tag1 == CMD_SUBSYSTEM_ETH)) {
 			be_parse_stats(adapter);
-			netdev_stats_update(adapter);
 			adapter->stats_cmd_sent = false;
 		}
 	} else {
diff --git a/drivers/net/benet/be_ethtool.c b/drivers/net/benet/be_ethtool.c
index 0300b9d..e92a8d8 100644
--- a/drivers/net/benet/be_ethtool.c
+++ b/drivers/net/benet/be_ethtool.c
@@ -74,10 +74,12 @@ static const struct be_ethtool_stat et_stats[] = {
 };
 #define ETHTOOL_STATS_NUM ARRAY_SIZE(et_stats)
 
-/* Stats related to multi RX queues */
+/* Stats related to multi RX queues: get_stats routine assumes bytes, pkts
+ * are first and second members respectively.
+ */
 static const struct be_ethtool_stat et_rx_stats[] = {
-	{DRVSTAT_RX_INFO(rx_bytes)},
-	{DRVSTAT_RX_INFO(rx_pkts)},
+	{DRVSTAT_RX_INFO(rx_bytes)},/* If moving this member see above note */
+	{DRVSTAT_RX_INFO(rx_pkts)}, /* If moving this member see above note */
 	{DRVSTAT_RX_INFO(rx_polls)},
 	{DRVSTAT_RX_INFO(rx_events)},
 	{DRVSTAT_RX_INFO(rx_compl)},
@@ -88,8 +90,11 @@ static const struct be_ethtool_stat et_rx_stats[] = {
 };
 #define ETHTOOL_RXSTATS_NUM (ARRAY_SIZE(et_rx_stats))
 
-/* Stats related to multi TX queues */
+/* Stats related to multi TX queues: get_stats routine assumes compl is the
+ * first member
+ */
 static const struct be_ethtool_stat et_tx_stats[] = {
+	{DRVSTAT_TX_INFO(tx_compl)}, /* If moving this member see above note */
 	{DRVSTAT_TX_INFO(tx_bytes)},
 	{DRVSTAT_TX_INFO(tx_pkts)},
 	{DRVSTAT_TX_INFO(tx_reqs)},
@@ -243,32 +248,48 @@ be_get_ethtool_stats(struct net_device *netdev,
 	struct be_rx_obj *rxo;
 	struct be_tx_obj *txo;
 	void *p;
-	int i, j, base;
+	unsigned int i, j, base = 0, start;
 
 	for (i = 0; i < ETHTOOL_STATS_NUM; i++) {
 		p = (u8 *)&adapter->drv_stats + et_stats[i].offset;
-		data[i] = (et_stats[i].size == sizeof(u64)) ?
-				*(u64 *)p: *(u32 *)p;
+		data[i] = *(u32 *)p;
 	}
+	base += ETHTOOL_STATS_NUM;
 
-	base = ETHTOOL_STATS_NUM;
 	for_all_rx_queues(adapter, rxo, j) {
-		for (i = 0; i < ETHTOOL_RXSTATS_NUM; i++) {
-			p = (u8 *)rx_stats(rxo) + et_rx_stats[i].offset;
-			data[base + j * ETHTOOL_RXSTATS_NUM + i] =
-				(et_rx_stats[i].size == sizeof(u64)) ?
-					*(u64 *)p: *(u32 *)p;
+		struct be_rx_stats *stats = rx_stats(rxo);
+
+		do {
+			start = u64_stats_fetch_begin_bh(&stats->sync);
+			data[base] = stats->rx_bytes;
+			data[base + 1] = stats->rx_pkts;
+		} while (u64_stats_fetch_retry_bh(&stats->sync, start));
+
+		for (i = 2; i < ETHTOOL_RXSTATS_NUM; i++) {
+			p = (u8 *)stats + et_rx_stats[i].offset;
+			data[base + i] = *(u32 *)p;
 		}
+		base += ETHTOOL_RXSTATS_NUM;
 	}
 
-	base = ETHTOOL_STATS_NUM + adapter->num_rx_qs * ETHTOOL_RXSTATS_NUM;
 	for_all_tx_queues(adapter, txo, j) {
-		for (i = 0; i < ETHTOOL_TXSTATS_NUM; i++) {
-			p = (u8 *)tx_stats(txo) + et_tx_stats[i].offset;
-			data[base + j * ETHTOOL_TXSTATS_NUM + i] =
-				(et_tx_stats[i].size == sizeof(u64)) ?
-					*(u64 *)p: *(u32 *)p;
-		}
+		struct be_tx_stats *stats = tx_stats(txo);
+
+		do {
+			start = u64_stats_fetch_begin_bh(&stats->sync_compl);
+			data[base] = stats->tx_compl;
+		} while (u64_stats_fetch_retry_bh(&stats->sync_compl, start));
+
+		do {
+			start = u64_stats_fetch_begin_bh(&stats->sync);
+			for (i = 1; i < ETHTOOL_TXSTATS_NUM; i++) {
+				p = (u8 *)stats + et_tx_stats[i].offset;
+				data[base + i] =
+					(et_tx_stats[i].size == sizeof(u64)) ?
+						*(u64 *)p : *(u32 *)p;
+			}
+		} while (u64_stats_fetch_retry_bh(&stats->sync, start));
+		base += ETHTOOL_TXSTATS_NUM;
 	}
 }
 
diff --git a/drivers/net/benet/be_main.c b/drivers/net/benet/be_main.c
index 9cfbfdf..9f2f66c 100644
--- a/drivers/net/benet/be_main.c
+++ b/drivers/net/benet/be_main.c
@@ -396,36 +396,44 @@ void be_parse_stats(struct be_adapter *adapter)
 			erx->rx_drops_no_fragments[rxo->q.id];
 }
 
-void netdev_stats_update(struct be_adapter *adapter)
+static struct rtnl_link_stats64 *be_get_stats64(struct net_device *netdev,
+					struct rtnl_link_stats64 *stats)
 {
+	struct be_adapter *adapter = netdev_priv(netdev);
 	struct be_drv_stats *drvs = &adapter->drv_stats;
-	struct net_device_stats *dev_stats = &adapter->netdev->stats;
 	struct be_rx_obj *rxo;
 	struct be_tx_obj *txo;
-	unsigned long pkts = 0, bytes = 0, mcast = 0, drops = 0;
+	u64 pkts, bytes;
+	unsigned int start;
 	int i;
 
 	for_all_rx_queues(adapter, rxo, i) {
-		pkts += rx_stats(rxo)->rx_pkts;
-		bytes += rx_stats(rxo)->rx_bytes;
-		mcast += rx_stats(rxo)->rx_mcast_pkts;
-		drops += rx_stats(rxo)->rx_drops_no_skbs;
+		const struct be_rx_stats *rx_stats = rx_stats(rxo);
+		do {
+			start = u64_stats_fetch_begin_bh(&rx_stats->sync);
+			pkts = rx_stats(rxo)->rx_pkts;
+			bytes = rx_stats(rxo)->rx_bytes;
+		} while (u64_stats_fetch_retry_bh(&rx_stats->sync, start));
+		stats->rx_packets += pkts;
+		stats->rx_bytes += bytes;
+		stats->multicast += rx_stats(rxo)->rx_mcast_pkts;
+		stats->rx_dropped += rx_stats(rxo)->rx_drops_no_skbs +
+					rx_stats(rxo)->rx_drops_no_frags;
 	}
-	dev_stats->rx_packets = pkts;
-	dev_stats->rx_bytes = bytes;
-	dev_stats->multicast = mcast;
-	dev_stats->rx_dropped = drops;
 
-	pkts = bytes = 0;
 	for_all_tx_queues(adapter, txo, i) {
-		pkts += tx_stats(txo)->tx_pkts;
-		bytes += tx_stats(txo)->tx_bytes;
+		const struct be_tx_stats *tx_stats = tx_stats(txo);
+		do {
+			start = u64_stats_fetch_begin_bh(&tx_stats->sync);
+			pkts = tx_stats(txo)->tx_pkts;
+			bytes = tx_stats(txo)->tx_bytes;
+		} while (u64_stats_fetch_retry_bh(&tx_stats->sync, start));
+		stats->tx_packets += pkts;
+		stats->tx_bytes += bytes;
 	}
-	dev_stats->tx_packets = pkts;
-	dev_stats->tx_bytes = bytes;
 
 	/* bad pkts received */
-	dev_stats->rx_errors = drvs->rx_crc_errors +
+	stats->rx_errors = drvs->rx_crc_errors +
 		drvs->rx_alignment_symbol_errors +
 		drvs->rx_in_range_errors +
 		drvs->rx_out_range_errors +
@@ -434,26 +442,24 @@ void netdev_stats_update(struct be_adapter *adapter)
 		drvs->rx_dropped_too_short +
 		drvs->rx_dropped_header_too_small +
 		drvs->rx_dropped_tcp_length +
-		drvs->rx_dropped_runt +
-		drvs->rx_tcp_checksum_errs +
-		drvs->rx_ip_checksum_errs +
-		drvs->rx_udp_checksum_errs;
+		drvs->rx_dropped_runt;
 
 	/* detailed rx errors */
-	dev_stats->rx_length_errors = drvs->rx_in_range_errors +
+	stats->rx_length_errors = drvs->rx_in_range_errors +
 		drvs->rx_out_range_errors +
 		drvs->rx_frame_too_long;
 
-	dev_stats->rx_crc_errors = drvs->rx_crc_errors;
+	stats->rx_crc_errors = drvs->rx_crc_errors;
 
 	/* frame alignment errors */
-	dev_stats->rx_frame_errors = drvs->rx_alignment_symbol_errors;
+	stats->rx_frame_errors = drvs->rx_alignment_symbol_errors;
 
 	/* receiver fifo overrun */
 	/* drops_no_pbuf is no per i/f, it's per BE card */
-	dev_stats->rx_fifo_errors = drvs->rxpp_fifo_overflow_drop +
+	stats->rx_fifo_errors = drvs->rxpp_fifo_overflow_drop +
 				drvs->rx_input_fifo_overflow_drop +
 				drvs->rx_drops_no_pbuf;
+	return stats;
 }
 
 void be_link_status_update(struct be_adapter *adapter, bool link_up)
@@ -479,12 +485,14 @@ static void be_tx_stats_update(struct be_tx_obj *txo,
 {
 	struct be_tx_stats *stats = tx_stats(txo);
 
+	u64_stats_update_begin(&stats->sync);
 	stats->tx_reqs++;
 	stats->tx_wrbs += wrb_cnt;
 	stats->tx_bytes += copied;
 	stats->tx_pkts += (gso_segs ? gso_segs : 1);
 	if (stopped)
 		stats->tx_stops++;
+	u64_stats_update_end(&stats->sync);
 }
 
 /* Determine number of WRB entries needed to xmit data in an skb */
@@ -905,7 +913,8 @@ static void be_rx_eqd_update(struct be_adapter *adapter, struct be_rx_obj *rxo)
 	struct be_rx_stats *stats = rx_stats(rxo);
 	ulong now = jiffies;
 	ulong delta = now - stats->rx_jiffies;
-	u32 eqd;
+	u64 pkts;
+	unsigned int start, eqd;
 
 	if (!rx_eq->enable_aic)
 		return;
@@ -920,8 +929,13 @@ static void be_rx_eqd_update(struct be_adapter *adapter, struct be_rx_obj *rxo)
 	if (delta < HZ)
 		return;
 
-	stats->rx_pps = (stats->rx_pkts - stats->rx_pkts_prev) / (delta / HZ);
-	stats->rx_pkts_prev = stats->rx_pkts;
+	do {
+		start = u64_stats_fetch_begin_bh(&stats->sync);
+		pkts = stats->rx_pkts;
+	} while (u64_stats_fetch_retry_bh(&stats->sync, start));
+
+	stats->rx_pps = (pkts - stats->rx_pkts_prev) / (delta / HZ);
+	stats->rx_pkts_prev = pkts;
 	stats->rx_jiffies = now;
 	eqd = stats->rx_pps / 110000;
 	eqd = eqd << 3;
@@ -942,6 +956,7 @@ static void be_rx_stats_update(struct be_rx_obj *rxo,
 {
 	struct be_rx_stats *stats = rx_stats(rxo);
 
+	u64_stats_update_begin(&stats->sync);
 	stats->rx_compl++;
 	stats->rx_bytes += rxcp->pkt_size;
 	stats->rx_pkts++;
@@ -949,6 +964,7 @@ static void be_rx_stats_update(struct be_rx_obj *rxo,
 		stats->rx_mcast_pkts++;
 	if (rxcp->err)
 		stats->rx_compl_err++;
+	u64_stats_update_end(&stats->sync);
 }
 
 static inline bool csum_passed(struct be_rx_compl_info *rxcp)
@@ -1878,8 +1894,9 @@ static int be_poll_tx_mcc(struct napi_struct *napi, int budget)
 				netif_wake_subqueue(adapter->netdev, i);
 			}
 
-			adapter->drv_stats.tx_events++;
+			u64_stats_update_begin(&tx_stats(txo)->sync_compl);
 			tx_stats(txo)->tx_compl += tx_compl;
+			u64_stats_update_end(&tx_stats(txo)->sync_compl);
 		}
 	}
 
@@ -1893,6 +1910,7 @@ static int be_poll_tx_mcc(struct napi_struct *napi, int budget)
 	napi_complete(napi);
 
 	be_eq_notify(adapter, tx_eq->q.id, true, false, 0);
+	adapter->drv_stats.tx_events++;
 	return 1;
 }
 
@@ -2843,6 +2861,7 @@ static struct net_device_ops be_netdev_ops = {
 	.ndo_set_rx_mode	= be_set_multicast_list,
 	.ndo_set_mac_address	= be_mac_addr_set,
 	.ndo_change_mtu		= be_change_mtu,
+	.ndo_get_stats64	= be_get_stats64,
 	.ndo_validate_addr	= eth_validate_addr,
 	.ndo_vlan_rx_add_vid	= be_vlan_add_vid,
 	.ndo_vlan_rx_kill_vid	= be_vlan_rem_vid,
-- 
1.7.4


^ permalink raw reply related
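
The begin/retry loops in the be2net patch above follow a sequence-counter pattern: the writer makes a counter odd while it updates, and the reader retries if the counter was odd or changed during its read. The following user-space sketch illustrates that idea with C11 atomics. It is an analogy, not the kernel's u64_stats_sync implementation: all type and function names are hypothetical, a single writer per counter set is assumed, and the counters are relaxed atomics only to keep the sketch free of data races.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

struct stats_sync {
	atomic_uint seq;	/* odd while an update is in progress */
};

struct rx_counters {
	struct stats_sync sync;
	_Atomic uint64_t bytes;
	_Atomic uint64_t pkts;
};

/* Writer side: mark the start of an update (single writer assumed). */
static void stats_update_begin(struct stats_sync *s)
{
	unsigned int seq = atomic_load_explicit(&s->seq, memory_order_relaxed);

	atomic_store_explicit(&s->seq, seq + 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);
}

/* Writer side: publish the update by making the counter even again. */
static void stats_update_end(struct stats_sync *s)
{
	unsigned int seq = atomic_load_explicit(&s->seq, memory_order_relaxed);

	atomic_store_explicit(&s->seq, seq + 1, memory_order_release);
}

/* Reader side: wait for an even counter value and remember it. */
static unsigned int stats_fetch_begin(struct stats_sync *s)
{
	unsigned int seq;

	while ((seq = atomic_load_explicit(&s->seq, memory_order_acquire)) & 1)
		;	/* writer active: spin */
	return seq;
}

/* Reader side: nonzero if the counter moved, so the read must be redone. */
static int stats_fetch_retry(struct stats_sync *s, unsigned int start)
{
	atomic_thread_fence(memory_order_acquire);
	return atomic_load_explicit(&s->seq, memory_order_relaxed) != start;
}

/* Writer: account one received packet of pkt_size bytes. */
static void rx_account(struct rx_counters *c, uint64_t pkt_size)
{
	uint64_t b = atomic_load_explicit(&c->bytes, memory_order_relaxed);
	uint64_t p = atomic_load_explicit(&c->pkts, memory_order_relaxed);

	stats_update_begin(&c->sync);
	atomic_store_explicit(&c->bytes, b + pkt_size, memory_order_relaxed);
	atomic_store_explicit(&c->pkts, p + 1, memory_order_relaxed);
	stats_update_end(&c->sync);
}

/* Reader: take a consistent snapshot of both counters. */
static void rx_snapshot(struct rx_counters *c, uint64_t *bytes, uint64_t *pkts)
{
	unsigned int start;

	do {
		start = stats_fetch_begin(&c->sync);
		*bytes = atomic_load_explicit(&c->bytes, memory_order_relaxed);
		*pkts = atomic_load_explicit(&c->pkts, memory_order_relaxed);
	} while (stats_fetch_retry(&c->sync, start));
}
```

`rx_account()` plays the role of be_rx_stats_update() and `rx_snapshot()` the role of the read loop in be_get_stats64(); the reader never blocks the writer, it only repeats its read when an update overlapped it.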

* [PATCH net-next-2.6 1/2] be2net: cleanup and refactor stats code
From: Sathya Perla @ 2011-07-26  5:10 UTC (permalink / raw)
  To: netdev
In-Reply-To: <1311657015-23465-1-git-send-email-sathya.perla@emulex.com>

In preparation for the 64-bit stats interface, the following cleanups help
streamline the code:
1) made some more rx/tx stats stored by the driver 64-bit
2) made some HW stats (err/drop counters) stored in be_drv_stats 32-bit to
   keep the code simple, as BE provides 32-bit counters only
3) removed duplication of netdev stats in ethtool
4) removed some unnecessary stats and fixed some names

Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
---
 drivers/net/benet/be.h         |  110 ++++++++---------
 drivers/net/benet/be_cmds.c    |   20 ---
 drivers/net/benet/be_cmds.h    |   53 +-------
 drivers/net/benet/be_ethtool.c |   65 +++--------
 drivers/net/benet/be_main.c    |  263 +++++++++++++---------------------------
 5 files changed, 155 insertions(+), 356 deletions(-)

diff --git a/drivers/net/benet/be.h b/drivers/net/benet/be.h
index c85768c..68227fd 100644
--- a/drivers/net/benet/be.h
+++ b/drivers/net/benet/be.h
@@ -167,15 +167,13 @@ struct be_mcc_obj {
 };
 
 struct be_tx_stats {
-	u32 be_tx_reqs;		/* number of TX requests initiated */
-	u32 be_tx_stops;	/* number of times TX Q was stopped */
-	u32 be_tx_wrbs;		/* number of tx WRBs used */
-	u32 be_tx_compl;	/* number of tx completion entries processed */
-	ulong be_tx_jiffies;
-	u64 be_tx_bytes;
-	u64 be_tx_bytes_prev;
-	u64 be_tx_pkts;
-	u32 be_tx_rate;
+	u64 tx_bytes;
+	u64 tx_pkts;
+	u64 tx_reqs;
+	u64 tx_wrbs;
+	u64 tx_compl;
+	ulong tx_jiffies;
+	u32 tx_stops;
 };
 
 struct be_tx_obj {
@@ -195,22 +193,19 @@ struct be_rx_page_info {
 };
 
 struct be_rx_stats {
-	u32 rx_post_fail;/* number of ethrx buffer alloc failures */
-	u32 rx_polls;	/* number of times NAPI called poll function */
-	u32 rx_events;	/* number of ucast rx completion events  */
-	u32 rx_compl;	/* number of rx completion entries processed */
-	ulong rx_dropped; /* number of skb allocation errors */
-	ulong rx_jiffies;
 	u64 rx_bytes;
-	u64 rx_bytes_prev;
 	u64 rx_pkts;
-	u32 rx_rate;
+	u64 rx_pkts_prev;
+	ulong rx_jiffies;
+	u32 rx_drops_no_skbs;	/* skb allocation errors */
+	u32 rx_drops_no_frags;	/* HW has no fetched frags */
+	u32 rx_post_fail;	/* page post alloc failures */
+	u32 rx_polls;		/* NAPI calls */
+	u32 rx_events;
+	u32 rx_compl;
 	u32 rx_mcast_pkts;
-	u32 rxcp_err;	/* Num rx completion entries w/ err set. */
-	ulong rx_fps_jiffies;	/* jiffies at last FPS calc */
-	u32 rx_frags;
-	u32 prev_rx_frags;
-	u32 rx_fps;		/* Rx frags per second */
+	u32 rx_compl_err;	/* completions with err set */
+	u32 rx_pps;		/* pkts per second */
 };
 
 struct be_rx_compl_info {
@@ -247,43 +242,40 @@ struct be_rx_obj {
 
 struct be_drv_stats {
 	u8 be_on_die_temperature;
-	u64 be_tx_events;
-	u64 eth_red_drops;
-	u64 rx_drops_no_pbuf;
-	u64 rx_drops_no_txpb;
-	u64 rx_drops_no_erx_descr;
-	u64 rx_drops_no_tpre_descr;
-	u64 rx_drops_too_many_frags;
-	u64 rx_drops_invalid_ring;
-	u64 forwarded_packets;
-	u64 rx_drops_mtu;
-	u64 rx_crc_errors;
-	u64 rx_alignment_symbol_errors;
-	u64 rx_pause_frames;
-	u64 rx_priority_pause_frames;
-	u64 rx_control_frames;
-	u64 rx_in_range_errors;
-	u64 rx_out_range_errors;
-	u64 rx_frame_too_long;
-	u64 rx_address_match_errors;
-	u64 rx_dropped_too_small;
-	u64 rx_dropped_too_short;
-	u64 rx_dropped_header_too_small;
-	u64 rx_dropped_tcp_length;
-	u64 rx_dropped_runt;
-	u64 rx_ip_checksum_errs;
-	u64 rx_tcp_checksum_errs;
-	u64 rx_udp_checksum_errs;
-	u64 rx_switched_unicast_packets;
-	u64 rx_switched_multicast_packets;
-	u64 rx_switched_broadcast_packets;
-	u64 tx_pauseframes;
-	u64 tx_priority_pauseframes;
-	u64 tx_controlframes;
-	u64 rxpp_fifo_overflow_drop;
-	u64 rx_input_fifo_overflow_drop;
-	u64 pmem_fifo_overflow_drop;
-	u64 jabber_events;
+	u32 tx_events;
+	u32 eth_red_drops;
+	u32 rx_drops_no_pbuf;
+	u32 rx_drops_no_txpb;
+	u32 rx_drops_no_erx_descr;
+	u32 rx_drops_no_tpre_descr;
+	u32 rx_drops_too_many_frags;
+	u32 rx_drops_invalid_ring;
+	u32 forwarded_packets;
+	u32 rx_drops_mtu;
+	u32 rx_crc_errors;
+	u32 rx_alignment_symbol_errors;
+	u32 rx_pause_frames;
+	u32 rx_priority_pause_frames;
+	u32 rx_control_frames;
+	u32 rx_in_range_errors;
+	u32 rx_out_range_errors;
+	u32 rx_frame_too_long;
+	u32 rx_address_match_errors;
+	u32 rx_dropped_too_small;
+	u32 rx_dropped_too_short;
+	u32 rx_dropped_header_too_small;
+	u32 rx_dropped_tcp_length;
+	u32 rx_dropped_runt;
+	u32 rx_ip_checksum_errs;
+	u32 rx_tcp_checksum_errs;
+	u32 rx_udp_checksum_errs;
+	u32 tx_pauseframes;
+	u32 tx_priority_pauseframes;
+	u32 tx_controlframes;
+	u32 rxpp_fifo_overflow_drop;
+	u32 rx_input_fifo_overflow_drop;
+	u32 pmem_fifo_overflow_drop;
+	u32 jabber_events;
 };
 
 struct be_vf_cfg {
diff --git a/drivers/net/benet/be_cmds.c b/drivers/net/benet/be_cmds.c
index 054fa67..e15f06a 100644
--- a/drivers/net/benet/be_cmds.c
+++ b/drivers/net/benet/be_cmds.c
@@ -82,26 +82,6 @@ static int be_mcc_compl_process(struct be_adapter *adapter,
 		if (((compl->tag0 == OPCODE_ETH_GET_STATISTICS) ||
 			 (compl->tag0 == OPCODE_ETH_GET_PPORT_STATS)) &&
 			(compl->tag1 == CMD_SUBSYSTEM_ETH)) {
-			if (adapter->generation == BE_GEN3) {
-				if (lancer_chip(adapter)) {
-					struct lancer_cmd_resp_pport_stats
-						*resp = adapter->stats_cmd.va;
-					be_dws_le_to_cpu(&resp->pport_stats,
-						sizeof(resp->pport_stats));
-				} else {
-					struct be_cmd_resp_get_stats_v1 *resp =
-							adapter->stats_cmd.va;
-
-				be_dws_le_to_cpu(&resp->hw_stats,
-							sizeof(resp->hw_stats));
-				}
-			} else {
-				struct be_cmd_resp_get_stats_v0 *resp =
-							adapter->stats_cmd.va;
-
-				be_dws_le_to_cpu(&resp->hw_stats,
-							sizeof(resp->hw_stats));
-			}
 			be_parse_stats(adapter);
 			netdev_stats_update(adapter);
 			adapter->stats_cmd_sent = false;
diff --git a/drivers/net/benet/be_cmds.h b/drivers/net/benet/be_cmds.h
index 8e4d488..d3342c4 100644
--- a/drivers/net/benet/be_cmds.h
+++ b/drivers/net/benet/be_cmds.h
@@ -693,8 +693,7 @@ struct be_cmd_resp_get_stats_v0 {
 	struct be_hw_stats_v0 hw_stats;
 };
 
-#define make_64bit_val(hi_32, lo_32)	(((u64)hi_32<<32) | lo_32)
-struct lancer_cmd_pport_stats {
+struct lancer_pport_stats {
 	u32 tx_packets_lo;
 	u32 tx_packets_hi;
 	u32 tx_unicast_packets_lo;
@@ -871,16 +870,16 @@ struct lancer_cmd_req_pport_stats {
 	struct be_cmd_req_hdr hdr;
 	union {
 		struct pport_stats_params params;
-		u8 rsvd[sizeof(struct lancer_cmd_pport_stats)];
+		u8 rsvd[sizeof(struct lancer_pport_stats)];
 	} cmd_params;
 };
 
 struct lancer_cmd_resp_pport_stats {
 	struct be_cmd_resp_hdr hdr;
-	struct lancer_cmd_pport_stats pport_stats;
+	struct lancer_pport_stats pport_stats;
 };
 
-static inline  struct lancer_cmd_pport_stats*
+static inline struct lancer_pport_stats*
 	pport_stats_from_cmd(struct be_adapter *adapter)
 {
 	struct lancer_cmd_resp_pport_stats *cmd = adapter->stats_cmd.va;
@@ -1383,8 +1382,7 @@ struct be_cmd_resp_get_stats_v1 {
 	struct be_hw_stats_v1 hw_stats;
 };
 
-static inline void *
-hw_stats_from_cmd(struct be_adapter *adapter)
+static inline void *hw_stats_from_cmd(struct be_adapter *adapter)
 {
 	if (adapter->generation == BE_GEN3) {
 		struct be_cmd_resp_get_stats_v1 *cmd = adapter->stats_cmd.va;
@@ -1397,34 +1395,6 @@ hw_stats_from_cmd(struct be_adapter *adapter)
 	}
 }
 
-static inline void *be_port_rxf_stats_from_cmd(struct be_adapter *adapter)
-{
-	if (adapter->generation == BE_GEN3) {
-		struct be_hw_stats_v1 *hw_stats = hw_stats_from_cmd(adapter);
-		struct be_rxf_stats_v1 *rxf_stats = &hw_stats->rxf;
-
-		return &rxf_stats->port[adapter->port_num];
-	} else {
-		struct be_hw_stats_v0 *hw_stats = hw_stats_from_cmd(adapter);
-		struct be_rxf_stats_v0 *rxf_stats = &hw_stats->rxf;
-
-		return &rxf_stats->port[adapter->port_num];
-	}
-}
-
-static inline void *be_rxf_stats_from_cmd(struct be_adapter *adapter)
-{
-	if (adapter->generation == BE_GEN3) {
-		struct be_hw_stats_v1 *hw_stats = hw_stats_from_cmd(adapter);
-
-		return &hw_stats->rxf;
-	} else {
-		struct be_hw_stats_v0 *hw_stats = hw_stats_from_cmd(adapter);
-
-		return &hw_stats->rxf;
-	}
-}
-
 static inline void *be_erx_stats_from_cmd(struct be_adapter *adapter)
 {
 	if (adapter->generation == BE_GEN3) {
@@ -1438,19 +1408,6 @@ static inline void *be_erx_stats_from_cmd(struct be_adapter *adapter)
 	}
 }
 
-static inline void *be_pmem_stats_from_cmd(struct be_adapter *adapter)
-{
-	if (adapter->generation == BE_GEN3) {
-		struct be_hw_stats_v1 *hw_stats = hw_stats_from_cmd(adapter);
-
-		return &hw_stats->pmem;
-	} else {
-		struct be_hw_stats_v0 *hw_stats = hw_stats_from_cmd(adapter);
-
-		return &hw_stats->pmem;
-	}
-}
-
 extern int be_pci_fnum_get(struct be_adapter *adapter);
 extern int be_cmd_POST(struct be_adapter *adapter);
 extern int be_cmd_mac_addr_query(struct be_adapter *adapter, u8 *mac_addr,
diff --git a/drivers/net/benet/be_ethtool.c b/drivers/net/benet/be_ethtool.c
index 7fd8130..0300b9d 100644
--- a/drivers/net/benet/be_ethtool.c
+++ b/drivers/net/benet/be_ethtool.c
@@ -26,33 +26,18 @@ struct be_ethtool_stat {
 	int offset;
 };
 
-enum {NETSTAT, DRVSTAT_TX, DRVSTAT_RX, ERXSTAT,
-			DRVSTAT};
+enum {DRVSTAT_TX, DRVSTAT_RX, DRVSTAT};
 #define FIELDINFO(_struct, field) FIELD_SIZEOF(_struct, field), \
 					offsetof(_struct, field)
-#define NETSTAT_INFO(field) 	#field, NETSTAT,\
-					FIELDINFO(struct net_device_stats,\
-						field)
 #define DRVSTAT_TX_INFO(field)	#field, DRVSTAT_TX,\
 					FIELDINFO(struct be_tx_stats, field)
 #define DRVSTAT_RX_INFO(field)	#field, DRVSTAT_RX,\
 					FIELDINFO(struct be_rx_stats, field)
-#define ERXSTAT_INFO(field)	#field, ERXSTAT,\
-					FIELDINFO(struct be_erx_stats_v1, field)
 #define	DRVSTAT_INFO(field)	#field, DRVSTAT,\
-					FIELDINFO(struct be_drv_stats, \
-						field)
+					FIELDINFO(struct be_drv_stats, field)
 
 static const struct be_ethtool_stat et_stats[] = {
-	{NETSTAT_INFO(rx_packets)},
-	{NETSTAT_INFO(tx_packets)},
-	{NETSTAT_INFO(rx_bytes)},
-	{NETSTAT_INFO(tx_bytes)},
-	{NETSTAT_INFO(rx_errors)},
-	{NETSTAT_INFO(tx_errors)},
-	{NETSTAT_INFO(rx_dropped)},
-	{NETSTAT_INFO(tx_dropped)},
-	{DRVSTAT_INFO(be_tx_events)},
+	{DRVSTAT_INFO(tx_events)},
 	{DRVSTAT_INFO(rx_crc_errors)},
 	{DRVSTAT_INFO(rx_alignment_symbol_errors)},
 	{DRVSTAT_INFO(rx_pause_frames)},
@@ -71,9 +56,6 @@ static const struct be_ethtool_stat et_stats[] = {
 	{DRVSTAT_INFO(rx_ip_checksum_errs)},
 	{DRVSTAT_INFO(rx_tcp_checksum_errs)},
 	{DRVSTAT_INFO(rx_udp_checksum_errs)},
-	{DRVSTAT_INFO(rx_switched_unicast_packets)},
-	{DRVSTAT_INFO(rx_switched_multicast_packets)},
-	{DRVSTAT_INFO(rx_switched_broadcast_packets)},
 	{DRVSTAT_INFO(tx_pauseframes)},
 	{DRVSTAT_INFO(tx_controlframes)},
 	{DRVSTAT_INFO(rx_priority_pause_frames)},
@@ -96,24 +78,24 @@ static const struct be_ethtool_stat et_stats[] = {
 static const struct be_ethtool_stat et_rx_stats[] = {
 	{DRVSTAT_RX_INFO(rx_bytes)},
 	{DRVSTAT_RX_INFO(rx_pkts)},
-	{DRVSTAT_RX_INFO(rx_rate)},
 	{DRVSTAT_RX_INFO(rx_polls)},
 	{DRVSTAT_RX_INFO(rx_events)},
 	{DRVSTAT_RX_INFO(rx_compl)},
 	{DRVSTAT_RX_INFO(rx_mcast_pkts)},
 	{DRVSTAT_RX_INFO(rx_post_fail)},
-	{DRVSTAT_RX_INFO(rx_dropped)},
-	{ERXSTAT_INFO(rx_drops_no_fragments)}
+	{DRVSTAT_RX_INFO(rx_drops_no_skbs)},
+	{DRVSTAT_RX_INFO(rx_drops_no_frags)}
 };
 #define ETHTOOL_RXSTATS_NUM (ARRAY_SIZE(et_rx_stats))
 
 /* Stats related to multi TX queues */
 static const struct be_ethtool_stat et_tx_stats[] = {
-	{DRVSTAT_TX_INFO(be_tx_rate)},
-	{DRVSTAT_TX_INFO(be_tx_reqs)},
-	{DRVSTAT_TX_INFO(be_tx_wrbs)},
-	{DRVSTAT_TX_INFO(be_tx_stops)},
-	{DRVSTAT_TX_INFO(be_tx_compl)}
+	{DRVSTAT_TX_INFO(tx_bytes)},
+	{DRVSTAT_TX_INFO(tx_pkts)},
+	{DRVSTAT_TX_INFO(tx_reqs)},
+	{DRVSTAT_TX_INFO(tx_wrbs)},
+	{DRVSTAT_TX_INFO(tx_compl)},
+	{DRVSTAT_TX_INFO(tx_stops)}
 };
 #define ETHTOOL_TXSTATS_NUM (ARRAY_SIZE(et_tx_stats))
 
@@ -260,20 +242,11 @@ be_get_ethtool_stats(struct net_device *netdev,
 	struct be_adapter *adapter = netdev_priv(netdev);
 	struct be_rx_obj *rxo;
 	struct be_tx_obj *txo;
-	void *p = NULL;
+	void *p;
 	int i, j, base;
 
 	for (i = 0; i < ETHTOOL_STATS_NUM; i++) {
-		switch (et_stats[i].type) {
-		case NETSTAT:
-			p = &netdev->stats;
-			break;
-		case DRVSTAT:
-			p = &adapter->drv_stats;
-			break;
-		}
-
-		p = (u8 *)p + et_stats[i].offset;
+		p = (u8 *)&adapter->drv_stats + et_stats[i].offset;
 		data[i] = (et_stats[i].size == sizeof(u64)) ?
 				*(u64 *)p: *(u32 *)p;
 	}
@@ -281,15 +254,7 @@ be_get_ethtool_stats(struct net_device *netdev,
 	base = ETHTOOL_STATS_NUM;
 	for_all_rx_queues(adapter, rxo, j) {
 		for (i = 0; i < ETHTOOL_RXSTATS_NUM; i++) {
-			switch (et_rx_stats[i].type) {
-			case DRVSTAT_RX:
-				p = (u8 *)&rxo->stats + et_rx_stats[i].offset;
-				break;
-			case ERXSTAT:
-				p = (u32 *)be_erx_stats_from_cmd(adapter) +
-								rxo->q.id;
-				break;
-			}
+			p = (u8 *)rx_stats(rxo) + et_rx_stats[i].offset;
 			data[base + j * ETHTOOL_RXSTATS_NUM + i] =
 				(et_rx_stats[i].size == sizeof(u64)) ?
 					*(u64 *)p: *(u32 *)p;
@@ -299,7 +264,7 @@ be_get_ethtool_stats(struct net_device *netdev,
 	base = ETHTOOL_STATS_NUM + adapter->num_rx_qs * ETHTOOL_RXSTATS_NUM;
 	for_all_tx_queues(adapter, txo, j) {
 		for (i = 0; i < ETHTOOL_TXSTATS_NUM; i++) {
-			p = (u8 *)&txo->stats + et_tx_stats[i].offset;
+			p = (u8 *)tx_stats(txo) + et_tx_stats[i].offset;
 			data[base + j * ETHTOOL_TXSTATS_NUM + i] =
 				(et_tx_stats[i].size == sizeof(u64)) ?
 					*(u64 *)p: *(u32 *)p;
diff --git a/drivers/net/benet/be_main.c b/drivers/net/benet/be_main.c
index c411bb1..9cfbfdf 100644
--- a/drivers/net/benet/be_main.c
+++ b/drivers/net/benet/be_main.c
@@ -245,14 +245,14 @@ netdev_addr:
 
 static void populate_be2_stats(struct be_adapter *adapter)
 {
-
-	struct be_drv_stats *drvs = &adapter->drv_stats;
-	struct be_pmem_stats *pmem_sts = be_pmem_stats_from_cmd(adapter);
+	struct be_hw_stats_v0 *hw_stats = hw_stats_from_cmd(adapter);
+	struct be_pmem_stats *pmem_sts = &hw_stats->pmem;
+	struct be_rxf_stats_v0 *rxf_stats = &hw_stats->rxf;
 	struct be_port_rxf_stats_v0 *port_stats =
-		be_port_rxf_stats_from_cmd(adapter);
-	struct be_rxf_stats_v0 *rxf_stats =
-		be_rxf_stats_from_cmd(adapter);
+					&rxf_stats->port[adapter->port_num];
+	struct be_drv_stats *drvs = &adapter->drv_stats;
 
+	be_dws_le_to_cpu(hw_stats, sizeof(*hw_stats));
 	drvs->rx_pause_frames = port_stats->rx_pause_frames;
 	drvs->rx_crc_errors = port_stats->rx_crc_errors;
 	drvs->rx_control_frames = port_stats->rx_control_frames;
@@ -267,12 +267,10 @@ static void populate_be2_stats(struct be_adapter *adapter)
 	drvs->rx_dropped_too_small = port_stats->rx_dropped_too_small;
 	drvs->rx_dropped_too_short = port_stats->rx_dropped_too_short;
 	drvs->rx_out_range_errors = port_stats->rx_out_range_errors;
-	drvs->rx_input_fifo_overflow_drop =
-		port_stats->rx_input_fifo_overflow;
+	drvs->rx_input_fifo_overflow_drop = port_stats->rx_input_fifo_overflow;
 	drvs->rx_dropped_header_too_small =
 		port_stats->rx_dropped_header_too_small;
-	drvs->rx_address_match_errors =
-		port_stats->rx_address_match_errors;
+	drvs->rx_address_match_errors = port_stats->rx_address_match_errors;
 	drvs->rx_alignment_symbol_errors =
 		port_stats->rx_alignment_symbol_errors;
 
@@ -280,36 +278,30 @@ static void populate_be2_stats(struct be_adapter *adapter)
 	drvs->tx_controlframes = port_stats->tx_controlframes;
 
 	if (adapter->port_num)
-		drvs->jabber_events =
-			rxf_stats->port1_jabber_events;
+		drvs->jabber_events = rxf_stats->port1_jabber_events;
 	else
-		drvs->jabber_events =
-			rxf_stats->port0_jabber_events;
+		drvs->jabber_events = rxf_stats->port0_jabber_events;
 	drvs->rx_drops_no_pbuf = rxf_stats->rx_drops_no_pbuf;
 	drvs->rx_drops_no_txpb = rxf_stats->rx_drops_no_txpb;
 	drvs->rx_drops_no_erx_descr = rxf_stats->rx_drops_no_erx_descr;
 	drvs->rx_drops_invalid_ring = rxf_stats->rx_drops_invalid_ring;
 	drvs->forwarded_packets = rxf_stats->forwarded_packets;
 	drvs->rx_drops_mtu = rxf_stats->rx_drops_mtu;
-	drvs->rx_drops_no_tpre_descr =
-		rxf_stats->rx_drops_no_tpre_descr;
-	drvs->rx_drops_too_many_frags =
-		rxf_stats->rx_drops_too_many_frags;
+	drvs->rx_drops_no_tpre_descr = rxf_stats->rx_drops_no_tpre_descr;
+	drvs->rx_drops_too_many_frags = rxf_stats->rx_drops_too_many_frags;
 	adapter->drv_stats.eth_red_drops = pmem_sts->eth_red_drops;
 }
 
 static void populate_be3_stats(struct be_adapter *adapter)
 {
-	struct be_drv_stats *drvs = &adapter->drv_stats;
-	struct be_pmem_stats *pmem_sts = be_pmem_stats_from_cmd(adapter);
-
-	struct be_rxf_stats_v1 *rxf_stats =
-		be_rxf_stats_from_cmd(adapter);
+	struct be_hw_stats_v1 *hw_stats = hw_stats_from_cmd(adapter);
+	struct be_pmem_stats *pmem_sts = &hw_stats->pmem;
+	struct be_rxf_stats_v1 *rxf_stats = &hw_stats->rxf;
 	struct be_port_rxf_stats_v1 *port_stats =
-		be_port_rxf_stats_from_cmd(adapter);
+					&rxf_stats->port[adapter->port_num];
+	struct be_drv_stats *drvs = &adapter->drv_stats;
 
-	drvs->rx_priority_pause_frames = 0;
-	drvs->pmem_fifo_overflow_drop = 0;
+	be_dws_le_to_cpu(hw_stats, sizeof(*hw_stats));
 	drvs->rx_pause_frames = port_stats->rx_pause_frames;
 	drvs->rx_crc_errors = port_stats->rx_crc_errors;
 	drvs->rx_control_frames = port_stats->rx_control_frames;
@@ -327,12 +319,10 @@ static void populate_be3_stats(struct be_adapter *adapter)
 		port_stats->rx_dropped_header_too_small;
 	drvs->rx_input_fifo_overflow_drop =
 		port_stats->rx_input_fifo_overflow_drop;
-	drvs->rx_address_match_errors =
-		port_stats->rx_address_match_errors;
+	drvs->rx_address_match_errors = port_stats->rx_address_match_errors;
 	drvs->rx_alignment_symbol_errors =
 		port_stats->rx_alignment_symbol_errors;
-	drvs->rxpp_fifo_overflow_drop =
-		port_stats->rxpp_fifo_overflow_drop;
+	drvs->rxpp_fifo_overflow_drop = port_stats->rxpp_fifo_overflow_drop;
 	drvs->tx_pauseframes = port_stats->tx_pauseframes;
 	drvs->tx_controlframes = port_stats->tx_controlframes;
 	drvs->jabber_events = port_stats->jabber_events;
@@ -342,10 +332,8 @@ static void populate_be3_stats(struct be_adapter *adapter)
 	drvs->rx_drops_invalid_ring = rxf_stats->rx_drops_invalid_ring;
 	drvs->forwarded_packets = rxf_stats->forwarded_packets;
 	drvs->rx_drops_mtu = rxf_stats->rx_drops_mtu;
-	drvs->rx_drops_no_tpre_descr =
-		rxf_stats->rx_drops_no_tpre_descr;
-	drvs->rx_drops_too_many_frags =
-		rxf_stats->rx_drops_too_many_frags;
+	drvs->rx_drops_no_tpre_descr = rxf_stats->rx_drops_no_tpre_descr;
+	drvs->rx_drops_too_many_frags = rxf_stats->rx_drops_too_many_frags;
 	adapter->drv_stats.eth_red_drops = pmem_sts->eth_red_drops;
 }
 
@@ -353,22 +341,15 @@ static void populate_lancer_stats(struct be_adapter *adapter)
 {
 
 	struct be_drv_stats *drvs = &adapter->drv_stats;
-	struct lancer_cmd_pport_stats *pport_stats = pport_stats_from_cmd
-						(adapter);
-	drvs->rx_priority_pause_frames = 0;
-	drvs->pmem_fifo_overflow_drop = 0;
-	drvs->rx_pause_frames =
-		make_64bit_val(pport_stats->rx_pause_frames_hi,
-				 pport_stats->rx_pause_frames_lo);
-	drvs->rx_crc_errors = make_64bit_val(pport_stats->rx_crc_errors_hi,
-						pport_stats->rx_crc_errors_lo);
-	drvs->rx_control_frames =
-			make_64bit_val(pport_stats->rx_control_frames_hi,
-			pport_stats->rx_control_frames_lo);
+	struct lancer_pport_stats *pport_stats =
+					pport_stats_from_cmd(adapter);
+
+	be_dws_le_to_cpu(pport_stats, sizeof(*pport_stats));
+	drvs->rx_pause_frames = pport_stats->rx_pause_frames_lo;
+	drvs->rx_crc_errors = pport_stats->rx_crc_errors_lo;
+	drvs->rx_control_frames = pport_stats->rx_control_frames_lo;
 	drvs->rx_in_range_errors = pport_stats->rx_in_range_errors;
-	drvs->rx_frame_too_long =
-		make_64bit_val(pport_stats->rx_internal_mac_errors_hi,
-					pport_stats->rx_frames_too_long_lo);
+	drvs->rx_frame_too_long = pport_stats->rx_frames_too_long_lo;
 	drvs->rx_dropped_runt = pport_stats->rx_dropped_runt;
 	drvs->rx_ip_checksum_errs = pport_stats->rx_ip_checksum_errors;
 	drvs->rx_tcp_checksum_errs = pport_stats->rx_tcp_checksum_errors;
@@ -382,32 +363,24 @@ static void populate_lancer_stats(struct be_adapter *adapter)
 				pport_stats->rx_dropped_header_too_small;
 	drvs->rx_input_fifo_overflow_drop = pport_stats->rx_fifo_overflow;
 	drvs->rx_address_match_errors = pport_stats->rx_address_match_errors;
-	drvs->rx_alignment_symbol_errors =
-		make_64bit_val(pport_stats->rx_symbol_errors_hi,
-				pport_stats->rx_symbol_errors_lo);
+	drvs->rx_alignment_symbol_errors = pport_stats->rx_symbol_errors_lo;
 	drvs->rxpp_fifo_overflow_drop = pport_stats->rx_fifo_overflow;
-	drvs->tx_pauseframes = make_64bit_val(pport_stats->tx_pause_frames_hi,
-					pport_stats->tx_pause_frames_lo);
-	drvs->tx_controlframes =
-		make_64bit_val(pport_stats->tx_control_frames_hi,
-				pport_stats->tx_control_frames_lo);
+	drvs->tx_pauseframes = pport_stats->tx_pause_frames_lo;
+	drvs->tx_controlframes = pport_stats->tx_control_frames_lo;
 	drvs->jabber_events = pport_stats->rx_jabbers;
-	drvs->rx_drops_no_pbuf = 0;
-	drvs->rx_drops_no_txpb = 0;
-	drvs->rx_drops_no_erx_descr = 0;
 	drvs->rx_drops_invalid_ring = pport_stats->rx_drops_invalid_queue;
-	drvs->forwarded_packets = make_64bit_val(pport_stats->num_forwards_hi,
-						pport_stats->num_forwards_lo);
-	drvs->rx_drops_mtu = make_64bit_val(pport_stats->rx_drops_mtu_hi,
-						pport_stats->rx_drops_mtu_lo);
-	drvs->rx_drops_no_tpre_descr = 0;
+	drvs->forwarded_packets = pport_stats->num_forwards_lo;
+	drvs->rx_drops_mtu = pport_stats->rx_drops_mtu_lo;
 	drvs->rx_drops_too_many_frags =
-		make_64bit_val(pport_stats->rx_drops_too_many_frags_hi,
-				pport_stats->rx_drops_too_many_frags_lo);
+				pport_stats->rx_drops_too_many_frags_lo;
 }
 
 void be_parse_stats(struct be_adapter *adapter)
 {
+	struct be_erx_stats_v1 *erx = be_erx_stats_from_cmd(adapter);
+	struct be_rx_obj *rxo;
+	int i;
+
 	if (adapter->generation == BE_GEN3) {
 		if (lancer_chip(adapter))
 			populate_lancer_stats(adapter);
@@ -416,6 +389,11 @@ void be_parse_stats(struct be_adapter *adapter)
 	} else {
 		populate_be2_stats(adapter);
 	}
+
+	/* as erx_v1 is longer than v0, ok to use v1 defn for v0 access */
+	for_all_rx_queues(adapter, rxo, i)
+		rx_stats(rxo)->rx_drops_no_frags =
+			erx->rx_drops_no_fragments[rxo->q.id];
 }
 
 void netdev_stats_update(struct be_adapter *adapter)
@@ -431,19 +409,7 @@ void netdev_stats_update(struct be_adapter *adapter)
 		pkts += rx_stats(rxo)->rx_pkts;
 		bytes += rx_stats(rxo)->rx_bytes;
 		mcast += rx_stats(rxo)->rx_mcast_pkts;
-		drops += rx_stats(rxo)->rx_dropped;
-		/*  no space in linux buffers: best possible approximation */
-		if (adapter->generation == BE_GEN3) {
-			if (!(lancer_chip(adapter))) {
-				struct be_erx_stats_v1 *erx =
-					be_erx_stats_from_cmd(adapter);
-				drops += erx->rx_drops_no_fragments[rxo->q.id];
-			}
-		} else {
-			struct be_erx_stats_v0 *erx =
-					be_erx_stats_from_cmd(adapter);
-			drops += erx->rx_drops_no_fragments[rxo->q.id];
-		}
+		drops += rx_stats(rxo)->rx_drops_no_skbs;
 	}
 	dev_stats->rx_packets = pkts;
 	dev_stats->rx_bytes = bytes;
@@ -452,8 +418,8 @@ void netdev_stats_update(struct be_adapter *adapter)
 
 	pkts = bytes = 0;
 	for_all_tx_queues(adapter, txo, i) {
-		pkts += tx_stats(txo)->be_tx_pkts;
-		bytes += tx_stats(txo)->be_tx_bytes;
+		pkts += tx_stats(txo)->tx_pkts;
+		bytes += tx_stats(txo)->tx_bytes;
 	}
 	dev_stats->tx_packets = pkts;
 	dev_stats->tx_bytes = bytes;
@@ -508,89 +474,17 @@ void be_link_status_update(struct be_adapter *adapter, bool link_up)
 	}
 }
 
-/* Update the EQ delay n BE based on the RX frags consumed / sec */
-static void be_rx_eqd_update(struct be_adapter *adapter, struct be_rx_obj *rxo)
-{
-	struct be_eq_obj *rx_eq = &rxo->rx_eq;
-	struct be_rx_stats *stats = &rxo->stats;
-	ulong now = jiffies;
-	u32 eqd;
-
-	if (!rx_eq->enable_aic)
-		return;
-
-	/* Wrapped around */
-	if (time_before(now, stats->rx_fps_jiffies)) {
-		stats->rx_fps_jiffies = now;
-		return;
-	}
-
-	/* Update once a second */
-	if ((now - stats->rx_fps_jiffies) < HZ)
-		return;
-
-	stats->rx_fps = (stats->rx_frags - stats->prev_rx_frags) /
-			((now - stats->rx_fps_jiffies) / HZ);
-
-	stats->rx_fps_jiffies = now;
-	stats->prev_rx_frags = stats->rx_frags;
-	eqd = stats->rx_fps / 110000;
-	eqd = eqd << 3;
-	if (eqd > rx_eq->max_eqd)
-		eqd = rx_eq->max_eqd;
-	if (eqd < rx_eq->min_eqd)
-		eqd = rx_eq->min_eqd;
-	if (eqd < 10)
-		eqd = 0;
-	if (eqd != rx_eq->cur_eqd)
-		be_cmd_modify_eqd(adapter, rx_eq->q.id, eqd);
-
-	rx_eq->cur_eqd = eqd;
-}
-
-static u32 be_calc_rate(u64 bytes, unsigned long ticks)
-{
-	u64 rate = bytes;
-
-	do_div(rate, ticks / HZ);
-	rate <<= 3;			/* bytes/sec -> bits/sec */
-	do_div(rate, 1000000ul);	/* MB/Sec */
-
-	return rate;
-}
-
-static void be_tx_rate_update(struct be_tx_obj *txo)
-{
-	struct be_tx_stats *stats = tx_stats(txo);
-	ulong now = jiffies;
-
-	/* Wrapped around? */
-	if (time_before(now, stats->be_tx_jiffies)) {
-		stats->be_tx_jiffies = now;
-		return;
-	}
-
-	/* Update tx rate once in two seconds */
-	if ((now - stats->be_tx_jiffies) > 2 * HZ) {
-		stats->be_tx_rate = be_calc_rate(stats->be_tx_bytes
-						  - stats->be_tx_bytes_prev,
-						 now - stats->be_tx_jiffies);
-		stats->be_tx_jiffies = now;
-		stats->be_tx_bytes_prev = stats->be_tx_bytes;
-	}
-}
-
 static void be_tx_stats_update(struct be_tx_obj *txo,
 			u32 wrb_cnt, u32 copied, u32 gso_segs, bool stopped)
 {
 	struct be_tx_stats *stats = tx_stats(txo);
 
-	stats->be_tx_reqs++;
-	stats->be_tx_wrbs += wrb_cnt;
-	stats->be_tx_bytes += copied;
-	stats->be_tx_pkts += (gso_segs ? gso_segs : 1);
+	stats->tx_reqs++;
+	stats->tx_wrbs += wrb_cnt;
+	stats->tx_bytes += copied;
+	stats->tx_pkts += (gso_segs ? gso_segs : 1);
 	if (stopped)
-		stats->be_tx_stops++;
+		stats->tx_stops++;
 }
 
 /* Determine number of WRB entries needed to xmit data in an skb */
@@ -1005,10 +899,16 @@ static int be_set_vf_tx_rate(struct net_device *netdev,
 	return status;
 }
 
-static void be_rx_rate_update(struct be_rx_obj *rxo)
+static void be_rx_eqd_update(struct be_adapter *adapter, struct be_rx_obj *rxo)
 {
-	struct be_rx_stats *stats = &rxo->stats;
+	struct be_eq_obj *rx_eq = &rxo->rx_eq;
+	struct be_rx_stats *stats = rx_stats(rxo);
 	ulong now = jiffies;
+	ulong delta = now - stats->rx_jiffies;
+	u32 eqd;
+
+	if (!rx_eq->enable_aic)
+		return;
 
 	/* Wrapped around */
 	if (time_before(now, stats->rx_jiffies)) {
@@ -1016,29 +916,39 @@ static void be_rx_rate_update(struct be_rx_obj *rxo)
 		return;
 	}
 
-	/* Update the rate once in two seconds */
-	if ((now - stats->rx_jiffies) < 2 * HZ)
+	/* Update once a second */
+	if (delta < HZ)
 		return;
 
-	stats->rx_rate = be_calc_rate(stats->rx_bytes - stats->rx_bytes_prev,
-				now - stats->rx_jiffies);
+	stats->rx_pps = (stats->rx_pkts - stats->rx_pkts_prev) / (delta / HZ);
+	stats->rx_pkts_prev = stats->rx_pkts;
 	stats->rx_jiffies = now;
-	stats->rx_bytes_prev = stats->rx_bytes;
+	eqd = stats->rx_pps / 110000;
+	eqd = eqd << 3;
+	if (eqd > rx_eq->max_eqd)
+		eqd = rx_eq->max_eqd;
+	if (eqd < rx_eq->min_eqd)
+		eqd = rx_eq->min_eqd;
+	if (eqd < 10)
+		eqd = 0;
+	if (eqd != rx_eq->cur_eqd) {
+		be_cmd_modify_eqd(adapter, rx_eq->q.id, eqd);
+		rx_eq->cur_eqd = eqd;
+	}
 }
 
 static void be_rx_stats_update(struct be_rx_obj *rxo,
 		struct be_rx_compl_info *rxcp)
 {
-	struct be_rx_stats *stats = &rxo->stats;
+	struct be_rx_stats *stats = rx_stats(rxo);
 
 	stats->rx_compl++;
-	stats->rx_frags += rxcp->num_rcvd;
 	stats->rx_bytes += rxcp->pkt_size;
 	stats->rx_pkts++;
 	if (rxcp->pkt_type == BE_MULTICAST_PACKET)
 		stats->rx_mcast_pkts++;
 	if (rxcp->err)
-		stats->rxcp_err++;
+		stats->rx_compl_err++;
 }
 
 static inline bool csum_passed(struct be_rx_compl_info *rxcp)
@@ -1174,7 +1084,7 @@ static void be_rx_compl_process(struct be_adapter *adapter,
 
 	skb = netdev_alloc_skb_ip_align(netdev, BE_HDR_LEN);
 	if (unlikely(!skb)) {
-		rxo->stats.rx_dropped++;
+		rx_stats(rxo)->rx_drops_no_skbs++;
 		be_rx_compl_discard(adapter, rxo, rxcp);
 		return;
 	}
@@ -1389,7 +1299,7 @@ static void be_post_rx_frags(struct be_rx_obj *rxo, gfp_t gfp)
 		if (!pagep) {
 			pagep = be_alloc_pages(adapter->big_page_size, gfp);
 			if (unlikely(!pagep)) {
-				rxo->stats.rx_post_fail++;
+				rx_stats(rxo)->rx_post_fail++;
 				break;
 			}
 			page_dmaaddr = dma_map_page(&adapter->pdev->dev, pagep,
@@ -1899,7 +1809,7 @@ static int be_poll_rx(struct napi_struct *napi, int budget)
 	struct be_rx_compl_info *rxcp;
 	u32 work_done;
 
-	rxo->stats.rx_polls++;
+	rx_stats(rxo)->rx_polls++;
 	for (work_done = 0; work_done < budget; work_done++) {
 		rxcp = be_rx_compl_get(rxo);
 		if (!rxcp)
@@ -1968,8 +1878,8 @@ static int be_poll_tx_mcc(struct napi_struct *napi, int budget)
 				netif_wake_subqueue(adapter->netdev, i);
 			}
 
-			adapter->drv_stats.be_tx_events++;
-			txo->stats.be_tx_compl += tx_compl;
+			adapter->drv_stats.tx_events++;
+			tx_stats(txo)->tx_compl += tx_compl;
 		}
 	}
 
@@ -2031,7 +1941,6 @@ static void be_worker(struct work_struct *work)
 	struct be_adapter *adapter =
 		container_of(work, struct be_adapter, work.work);
 	struct be_rx_obj *rxo;
-	struct be_tx_obj *txo;
 	int i;
 
 	if (!adapter->ue_detected && !lancer_chip(adapter))
@@ -2060,11 +1969,7 @@ static void be_worker(struct work_struct *work)
 			be_cmd_get_stats(adapter, &adapter->stats_cmd);
 	}
 
-	for_all_tx_queues(adapter, txo, i)
-		be_tx_rate_update(txo);
-
 	for_all_rx_queues(adapter, rxo, i) {
-		be_rx_rate_update(rxo);
 		be_rx_eqd_update(adapter, rxo);
 
 		if (rxo->rx_post_starved) {
-- 
1.7.4
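
Note for readers: be_rx_eqd_update() in the patch above derives the event
queue delay from the measured packets-per-second figure. The following is a
userspace sketch of just that arithmetic, assuming the constants shown in
the diff (divide by 110000, shift left by 3, treat values below 10 as 0);
demo_calc_eqd() and its parameters are hypothetical, and the precondition
min_eqd <= max_eqd is an assumption of the sketch, not something the driver
checks.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Sketch of the EQ delay computation: scale the packet rate, clamp the
 * result to [min_eqd, max_eqd], then switch the delay off when it is small.
 */
static uint32_t demo_calc_eqd(uint32_t rx_pps, uint32_t min_eqd,
			      uint32_t max_eqd)
{
	uint32_t eqd;

	assert(min_eqd <= max_eqd);	/* assumption of this sketch */

	eqd = rx_pps / 110000;
	eqd <<= 3;
	if (eqd > max_eqd)
		eqd = max_eqd;
	if (eqd < min_eqd)
		eqd = min_eqd;
	if (eqd < 10)
		eqd = 0;

	return eqd;
}
```

With min_eqd = 0 and max_eqd = 96 (values picked only for illustration),
220000 pkts/s gives a delay of 16, while 110000 pkts/s gives 8, which is
below the cutoff and therefore becomes 0.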



* [PATCH net-next-2.6 0/2] be2net: 64-bit stats
From: Sathya Perla @ 2011-07-26  5:10 UTC (permalink / raw)
  To: netdev

Please apply.

Sathya Perla (2):
  be2net: cleanup and refactor stats code
  be2net: use stats-sync to read/write 64-bit stats

 drivers/net/benet/be.h         |  115 +++++++--------
 drivers/net/benet/be_cmds.c    |   21 ---
 drivers/net/benet/be_cmds.h    |   53 +------
 drivers/net/benet/be_ethtool.c |  122 +++++++---------
 drivers/net/benet/be_main.c    |  326 +++++++++++++++-------------------------
 5 files changed, 239 insertions(+), 398 deletions(-)

-- 
1.7.4



* Re: [PATCH] net/smsc911x: add device tree probe support
From: Nicolas Pitre @ 2011-07-26  2:28 UTC (permalink / raw)
  To: Shawn Guo
  Cc: Grant Likely, patches, netdev, devicetree-discuss,
	Steve Glendinning, Shawn Guo, David S. Miller, linux-arm-kernel
In-Reply-To: <20110726013026.GH21641@S2100-06.ap.freescale.net>

On Tue, 26 Jul 2011, Shawn Guo wrote:

> On Mon, Jul 25, 2011 at 09:16:40PM -0400, Nicolas Pitre wrote:
> > On Tue, 26 Jul 2011, Shawn Guo wrote:
> > 
> > > On Mon, Jul 25, 2011 at 03:37:23PM -0600, Grant Likely wrote:
> > > > On Mon, Jul 25, 2011 at 05:44:00PM +0800, Shawn Guo wrote:
> > > > > It adds device tree probe support for smsc911x driver.
> > > > > 
> > > > > Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
> > > > > Cc: Grant Likely <grant.likely@secretlab.ca>
> > > > > Cc: Steve Glendinning <steve.glendinning@smsc.com>
> > > > > Cc: David S. Miller <davem@davemloft.net>
> > > > > ---
> > > > >  Documentation/devicetree/bindings/net/smsc.txt |   34 +++++++
> > > > >  drivers/net/smsc911x.c                         |  123 +++++++++++++++++++-----
> > > > >  2 files changed, 132 insertions(+), 25 deletions(-)
> > > > >  create mode 100644 Documentation/devicetree/bindings/net/smsc.txt
> > > > > 
> > > > > diff --git a/Documentation/devicetree/bindings/net/smsc.txt b/Documentation/devicetree/bindings/net/smsc.txt
> > > > > new file mode 100644
> > > > > index 0000000..1920695
> > > > > --- /dev/null
> > > > > +++ b/Documentation/devicetree/bindings/net/smsc.txt
> > > > > @@ -0,0 +1,34 @@
> > > > > +* Smart Mixed-Signal Connectivity (SMSC) LAN Controller
> > > > > +
> > > > > +Required properties:
> > > > > +- compatible : Should be "smsc,lan<model>""smsc,lan"
> > > > 
> > > > Drop "smsc,lan".  That's far too generic.
> > > > 
> > > The following devices are supported by the driver.
> > > 
> > > LAN9115, LAN9116, LAN9117, LAN9118
> > > LAN9215, LAN9216, LAN9217, LAN9218
> > > LAN9210, LAN9211
> > > LAN9220, LAN9221
> > > 
> > > If we only keep specific <model> as the compatible, we will have a
> > > long match table which is actually used nowhere to distinguish the
> > > device.
> > > 
> > > So we need a somewhat generic compatible string to avoid a
> > > needlessly long match table.  What about:
> > > 
> > > static const struct of_device_id smsc_dt_ids[] = {
> > >         { .compatible = "smsc,lan9", },
> > >         { /* sentinel */ }
> > > };
> > > 
> > > Or:
> > > 
> > > static const struct of_device_id smsc_dt_ids[] = {
> > >         { .compatible = "smsc,lan91", },
> > >         { .compatible = "smsc,lan92", },
> > >         { /* sentinel */ }
> > > };
> > 
> > Neither of these unambiguously distinguishes the devices supported by
> > this driver from those of the smc91x driver, which supports LAN91C92,
> > LAN91C94, LAN91C95, LAN91C96, LAN91C100, LAN91C110.
> > 
> So you suggest making a long list that explicitly names each device
> type the driver supports?

I'm not suggesting anything.  :-)  I'm merely pointing out that the 
above .compatible = "smsc,lan9" or .compatible = "smsc,lan91" are too 
generic given that there is another driver with different devices to 
which they could also apply.


Nicolas
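
Note for readers: the naming overlap described above can be checked with a
small userspace sketch. It is not how the kernel matches device tree nodes
(as far as I know, compatible strings are compared as whole strings, not
as prefixes); it only shows which model names a generic name would appear
to cover. demo_generic_covers() and the lowercase model-specific strings
are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Return true if the model-specific name begins with the generic name,
 * i.e. the generic name reads as if it described that device too.
 */
static bool demo_generic_covers(const char *generic, const char *specific)
{
	assert(generic != NULL && specific != NULL);

	return strncmp(generic, specific, strlen(generic)) == 0;
}
```

Here "smsc,lan91" covers both "smsc,lan9115" (an smsc911x part) and
"smsc,lan91c92" (an smc91x part), which is the ambiguity in question.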


* Re: [PATCH] net/smsc911x: add device tree probe support
From: Shawn Guo @ 2011-07-26  1:30 UTC (permalink / raw)
  To: Nicolas Pitre
  Cc: patches, netdev, devicetree-discuss, Steve Glendinning,
	David S. Miller, linux-arm-kernel
In-Reply-To: <alpine.LFD.2.00.1107252105561.12766-QuJgVwGFrdf/9pzu0YdTqQ@public.gmane.org>

On Mon, Jul 25, 2011 at 09:16:40PM -0400, Nicolas Pitre wrote:
> On Tue, 26 Jul 2011, Shawn Guo wrote:
> 
> > On Mon, Jul 25, 2011 at 03:37:23PM -0600, Grant Likely wrote:
> > > On Mon, Jul 25, 2011 at 05:44:00PM +0800, Shawn Guo wrote:
> > > > It adds device tree probe support for smsc911x driver.
> > > > 
> > > > Signed-off-by: Shawn Guo <shawn.guo-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
> > > > Cc: Grant Likely <grant.likely-s3s/WqlpOiPyB63q8FvJNQ@public.gmane.org>
> > > > Cc: Steve Glendinning <steve.glendinning-sdUf+H5yV5I@public.gmane.org>
> > > > Cc: David S. Miller <davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
> > > > ---
> > > >  Documentation/devicetree/bindings/net/smsc.txt |   34 +++++++
> > > >  drivers/net/smsc911x.c                         |  123 +++++++++++++++++++-----
> > > >  2 files changed, 132 insertions(+), 25 deletions(-)
> > > >  create mode 100644 Documentation/devicetree/bindings/net/smsc.txt
> > > > 
> > > > diff --git a/Documentation/devicetree/bindings/net/smsc.txt b/Documentation/devicetree/bindings/net/smsc.txt
> > > > new file mode 100644
> > > > index 0000000..1920695
> > > > --- /dev/null
> > > > +++ b/Documentation/devicetree/bindings/net/smsc.txt
> > > > @@ -0,0 +1,34 @@
> > > > +* Smart Mixed-Signal Connectivity (SMSC) LAN Controller
> > > > +
> > > > +Required properties:
> > > > +- compatible : Should be "smsc,lan<model>""smsc,lan"
> > > 
> > > Drop "smsc,lan".  That's far too generic.
> > > 
> > The following devices are supported by the driver.
> > 
> > LAN9115, LAN9116, LAN9117, LAN9118
> > LAN9215, LAN9216, LAN9217, LAN9218
> > LAN9210, LAN9211
> > LAN9220, LAN9221
> > 
> > If we only keep specific <model> as the compatible, we will have a
> > long match table which is actually used nowhere to distinguish the
> > device.
> > 
> > So we need some level generic compatible to save the meaningless
> > long match table.  What about: 
> > 
> > static const struct of_device_id smsc_dt_ids[] = {
> >         { .compatible = "smsc,lan9", },
> >         { /* sentinel */ }
> > };
> > 
> > Or:
> > 
> > static const struct of_device_id smsc_dt_ids[] = {
> >         { .compatible = "smsc,lan91", },
> >         { .compatible = "smsc,lan92", },
> >         { /* sentinel */ }
> > };
> 
> None of this unambiguously distinguish the devices supported by this 
> driver and the smc91x driver which supports LAN91C92, LAN91C94, 
> LAN91C95, LAN91C96, LAN91C100, LAN91C110.
> 
So you suggest making a long list that explicitly names every device type
the driver supports?

-- 
Regards,
Shawn

^ permalink raw reply

* Re: [PATCH] net/smsc911x: add device tree probe support
From: Nicolas Pitre @ 2011-07-26  1:16 UTC (permalink / raw)
  To: Shawn Guo
  Cc: Grant Likely, patches, netdev, devicetree-discuss,
	Steve Glendinning, Shawn Guo, David S. Miller, linux-arm-kernel
In-Reply-To: <20110726010154.GG21641@S2100-06.ap.freescale.net>

On Tue, 26 Jul 2011, Shawn Guo wrote:

> On Mon, Jul 25, 2011 at 03:37:23PM -0600, Grant Likely wrote:
> > On Mon, Jul 25, 2011 at 05:44:00PM +0800, Shawn Guo wrote:
> > > It adds device tree probe support for smsc911x driver.
> > > 
> > > Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
> > > Cc: Grant Likely <grant.likely@secretlab.ca>
> > > Cc: Steve Glendinning <steve.glendinning@smsc.com>
> > > Cc: David S. Miller <davem@davemloft.net>
> > > ---
> > >  Documentation/devicetree/bindings/net/smsc.txt |   34 +++++++
> > >  drivers/net/smsc911x.c                         |  123 +++++++++++++++++++-----
> > >  2 files changed, 132 insertions(+), 25 deletions(-)
> > >  create mode 100644 Documentation/devicetree/bindings/net/smsc.txt
> > > 
> > > diff --git a/Documentation/devicetree/bindings/net/smsc.txt b/Documentation/devicetree/bindings/net/smsc.txt
> > > new file mode 100644
> > > index 0000000..1920695
> > > --- /dev/null
> > > +++ b/Documentation/devicetree/bindings/net/smsc.txt
> > > @@ -0,0 +1,34 @@
> > > +* Smart Mixed-Signal Connectivity (SMSC) LAN Controller
> > > +
> > > +Required properties:
> > > +- compatible : Should be "smsc,lan<model>""smsc,lan"
> > 
> > Drop "smsc,lan".  That's far too generic.
> > 
> The following devices are supported by the driver.
> 
> LAN9115, LAN9116, LAN9117, LAN9118
> LAN9215, LAN9216, LAN9217, LAN9218
> LAN9210, LAN9211
> LAN9220, LAN9221
> 
> If we only keep specific <model> as the compatible, we will have a
> long match table which is actually used nowhere to distinguish the
> device.
> 
> So we need some level generic compatible to save the meaningless
> long match table.  What about: 
> 
> static const struct of_device_id smsc_dt_ids[] = {
>         { .compatible = "smsc,lan9", },
>         { /* sentinel */ }
> };
> 
> Or:
> 
> static const struct of_device_id smsc_dt_ids[] = {
>         { .compatible = "smsc,lan91", },
>         { .compatible = "smsc,lan92", },
>         { /* sentinel */ }
> };

None of these unambiguously distinguishes the devices supported by this 
driver from those of the smc91x driver, which supports LAN91C92, LAN91C94, 
LAN91C95, LAN91C96, LAN91C100, LAN91C110.


Nicolas

^ permalink raw reply

* Re: [PATCH] net/smsc911x: add device tree probe support
From: Shawn Guo @ 2011-07-26  1:01 UTC (permalink / raw)
  To: Grant Likely
  Cc: patches, netdev, devicetree-discuss, Steve Glendinning, Shawn Guo,
	David S. Miller, linux-arm-kernel
In-Reply-To: <20110725213723.GI26735@ponder.secretlab.ca>

On Mon, Jul 25, 2011 at 03:37:23PM -0600, Grant Likely wrote:
> On Mon, Jul 25, 2011 at 05:44:00PM +0800, Shawn Guo wrote:
> > It adds device tree probe support for smsc911x driver.
> > 
> > Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
> > Cc: Grant Likely <grant.likely@secretlab.ca>
> > Cc: Steve Glendinning <steve.glendinning@smsc.com>
> > Cc: David S. Miller <davem@davemloft.net>
> > ---
> >  Documentation/devicetree/bindings/net/smsc.txt |   34 +++++++
> >  drivers/net/smsc911x.c                         |  123 +++++++++++++++++++-----
> >  2 files changed, 132 insertions(+), 25 deletions(-)
> >  create mode 100644 Documentation/devicetree/bindings/net/smsc.txt
> > 
> > diff --git a/Documentation/devicetree/bindings/net/smsc.txt b/Documentation/devicetree/bindings/net/smsc.txt
> > new file mode 100644
> > index 0000000..1920695
> > --- /dev/null
> > +++ b/Documentation/devicetree/bindings/net/smsc.txt
> > @@ -0,0 +1,34 @@
> > +* Smart Mixed-Signal Connectivity (SMSC) LAN Controller
> > +
> > +Required properties:
> > +- compatible : Should be "smsc,lan<model>""smsc,lan"
> 
> Drop "smsc,lan".  That's far too generic.
> 
The following devices are supported by the driver.

LAN9115, LAN9116, LAN9117, LAN9118
LAN9215, LAN9216, LAN9217, LAN9218
LAN9210, LAN9211
LAN9220, LAN9221

If we keep only the specific <model> as the compatible, we will have a
long match table that is never actually used to distinguish between
devices.

So we need a compatible that is generic at some level, to avoid that
pointless long match table.  What about:

static const struct of_device_id smsc_dt_ids[] = {
        { .compatible = "smsc,lan9", },
        { /* sentinel */ }
};

Or:

static const struct of_device_id smsc_dt_ids[] = {
        { .compatible = "smsc,lan91", },
        { .compatible = "smsc,lan92", },
        { /* sentinel */ }
};

> > +- reg : Address and length of the io space for SMSC LAN
> > +- smsc-int-gpios : Should specify the GPIO for SMSC LAN interrupt line
> 
> This looks broken.  Shouldn't this be specified as a normal
> "interrupts" property?
> 
> > +- phy-mode : String, operation mode of the PHY interface.
> > +  Supported values are: "mii", "gmii", "sgmii", "tbi", "rmii",
> > +  "rgmii", "rgmii-id", "rgmii-rxid", "rgmii-txid", "rtbi", "smii".
> > +
> > +Optional properties:
> > +- smsc,irq-active-high : Indicates the IRQ polarity is active-high
> > +- smsc,irq-push-pull : Indicates the IRQ type is push-pull
> > +- smsc,register-needs-shift : Indicates the register access needs shift
> > +- smsc,access-in-32bit : Indicates the access to controller is in 32-bit
> > +  mode
> 
> Currently, reg-io-width and reg-shift are being used to manipulate
> register access on ns16550 serial ports.  The same thing can be used
> here.  See bindings/tty/serial/of-serial.txt
> 
They are not exactly the same.  reg-io-width and reg-shift in of-serial.txt
each take a number, while register-needs-shift and access-in-32bit
here are just boolean flags.  But if you feel strongly about making
them consistent, I can change them to reg-io-width and reg-shift and
have smsc911x parse numbers instead of flags.
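
A rough user-space sketch of what parsing numbers instead of flags could look like. The struct, the flag values and the function name are placeholders for this example, not the driver's definitions, and I am assuming reg-io-width is given in bytes:

```c
#include <errno.h>
#include <stdint.h>

/* Placeholder flag bits for this sketch only. */
#define EX_USE_16BIT	(1u << 0)
#define EX_USE_32BIT	(1u << 1)

/* Cut-down stand-in for the driver's platform config. */
struct ex_config {
	unsigned int flags;
	unsigned int shift;
};

/*
 * Translate numeric properties into the config:
 *   reg_io_width - access size in bytes (2 or 4 in this sketch)
 *   reg_shift    - how far register offsets are shifted left
 * Returns 0 on success, -EINVAL for an unsupported access width.
 */
static int ex_apply_reg_props(struct ex_config *cfg,
			      uint32_t reg_io_width, uint32_t reg_shift)
{
	switch (reg_io_width) {
	case 2:
		cfg->flags |= EX_USE_16BIT;
		break;
	case 4:
		cfg->flags |= EX_USE_32BIT;
		break;
	default:
		return -EINVAL;
	}

	cfg->shift = reg_shift;
	return 0;
}
```

With something like this, the example node would carry reg-io-width = <4> in place of the access-in-32bit flag, and a board that needs shifted access would give a non-zero reg-shift.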

> 
> > +- smsc,force-internal-phy : Forces SMSC LAN controller to use
> > +  internal PHY
> > +- smsc,force-external-phy : Forces SMSC LAN controller to use
> > +  external PHY
> 
> I would expect using an external phy would also expect a phy-device
> property to connect to the phy node.
> 
I do not understand the details.  But from the comment below in the
code, I guess the "external phy" is not external?

/* Autodetects and enables external phy if present on supported chips.
 * autodetection can be overridden by specifying SMSC911X_FORCE_INTERNAL_PHY
 * or SMSC911X_FORCE_EXTERNAL_PHY in the platform_data flags. */

> > +- smsc,save-mac-address : Indicates that mac address needs to be saved
> > +  before resetting the controller
> > +- local-mac-address : 6 bytes, mac address
> > +
> > +Examples:
> > +
> > +lan9220@f4000000 {
> > +	compatible = "smsc,lan9220", "smsc,lan";
> > +	reg = <0xf4000000 0x2000000>;
> > +	phy-mode = "mii";
> > +	smsc-int-gpios = <&gpio1 31 0>; /* GPIO2_31 */
> > +	smsc,irq-push-pull;
> > +	smsc,access-in-32bit;
> > +};
> > diff --git a/drivers/net/smsc911x.c b/drivers/net/smsc911x.c
> > index b9016a3..0097048 100644
> > --- a/drivers/net/smsc911x.c
> > +++ b/drivers/net/smsc911x.c
> > @@ -53,6 +53,10 @@
> >  #include <linux/phy.h>
> >  #include <linux/smsc911x.h>
> >  #include <linux/device.h>
> > +#include <linux/of.h>
> > +#include <linux/of_device.h>
> > +#include <linux/of_gpio.h>
> > +#include <linux/of_net.h>
> >  #include "smsc911x.h"
> >  
> >  #define SMSC_CHIPNAME		"smsc911x"
> > @@ -2095,25 +2099,67 @@ static const struct smsc911x_ops shifted_smsc911x_ops = {
> >  	.tx_writefifo = smsc911x_tx_writefifo_shift,
> >  };
> >  
> > +#ifdef CONFIG_OF
> > +static int __devinit smsc911x_probe_config_dt(
> > +				struct smsc911x_platform_config *config,
> > +				struct device_node *np)
> > +{
> > +	const char *mac;
> > +
> > +	if (!np)
> > +		return -ENODEV;
> > +
> > +	config->phy_interface = of_get_phy_mode(np);
> > +
> > +	mac = of_get_mac_address(np);
> > +	if (mac)
> > +		memcpy(config->mac, mac, ETH_ALEN);
> > +
> > +	if (of_get_property(np, "smsc,irq-active-high", NULL))
> > +		config->irq_polarity = SMSC911X_IRQ_POLARITY_ACTIVE_HIGH;
> > +
> > +	if (of_get_property(np, "smsc,irq-push-pull", NULL))
> > +		config->irq_type = SMSC911X_IRQ_TYPE_PUSH_PULL;
> > +
> > +	if (of_get_property(np, "smsc,register-needs-shift", NULL))
> > +		config->shift = 1;
> > +
> > +	if (of_get_property(np, "smsc,access-in-32bit", NULL))
> > +		config->flags |= SMSC911X_USE_32BIT;
> > +
> > +	if (of_get_property(np, "smsc,force-internal-phy", NULL))
> > +		config->flags |= SMSC911X_FORCE_INTERNAL_PHY;
> > +
> > +	if (of_get_property(np, "smsc,force-external-phy", NULL))
> > +		config->flags |= SMSC911X_FORCE_EXTERNAL_PHY;
> > +
> > +	if (of_get_property(np, "smsc,save-mac-address", NULL))
> > +		config->flags |= SMSC911X_SAVE_MAC_ADDRESS;
> > +
> > +	return 0;
> > +}
> > +#else
> > +static inline int smsc911x_probe_config_dt(
> > +				struct smsc911x_platform_config *config,
> > +				struct device_node *np)
> > +{
> > +	return -ENODEV;
> > +}
> > +#endif /* CONFIG_OF */
> > +
> >  static int __devinit smsc911x_drv_probe(struct platform_device *pdev)
> >  {
> > +	struct device_node *np = pdev->dev.of_node;
> >  	struct net_device *dev;
> >  	struct smsc911x_data *pdata;
> >  	struct smsc911x_platform_config *config = pdev->dev.platform_data;
> >  	struct resource *res, *irq_res;
> >  	unsigned int intcfg = 0;
> > -	int res_size, irq_flags;
> > -	int retval;
> > +	int irq_gpio, res_size, irq_flags = 0;
> > +	int retval = 0;
> >  
> >  	pr_info("Driver version %s\n", SMSC_DRV_VERSION);
> >  
> > -	/* platform data specifies irq & dynamic bus configuration */
> > -	if (!pdev->dev.platform_data) {
> > -		pr_warn("platform_data not provided\n");
> > -		retval = -ENODEV;
> > -		goto out_0;
> > -	}
> > -
> >  	res = platform_get_resource_byname(pdev, IORESOURCE_MEM,
> >  					   "smsc911x-memory");
> >  	if (!res)
> > @@ -2125,13 +2171,6 @@ static int __devinit smsc911x_drv_probe(struct platform_device *pdev)
> >  	}
> >  	res_size = resource_size(res);
> >  
> > -	irq_res = platform_get_resource(pdev, IORESOURCE_IRQ, 0);
> > -	if (!irq_res) {
> > -		pr_warn("Could not allocate irq resource\n");
> > -		retval = -ENODEV;
> > -		goto out_0;
> > -	}
> > -
> 
> This should still work for the device-tree situation.  Why remove it?
> 
> >  	if (!request_mem_region(res->start, res_size, SMSC_CHIPNAME)) {
> >  		retval = -EBUSY;
> >  		goto out_0;
> > @@ -2148,26 +2187,53 @@ static int __devinit smsc911x_drv_probe(struct platform_device *pdev)
> >  
> >  	pdata = netdev_priv(dev);
> >  
> > -	dev->irq = irq_res->start;
> > -	irq_flags = irq_res->flags & IRQF_TRIGGER_MASK;
> > -	pdata->ioaddr = ioremap_nocache(res->start, res_size);
> > -
> > -	/* copy config parameters across to pdata */
> > -	memcpy(&pdata->config, config, sizeof(pdata->config));
> > +	if (np) {
> > +		irq_gpio = of_get_named_gpio(np, "smsc-int-gpios", 0);
> > +		retval = gpio_request_one(irq_gpio, GPIOF_IN, "smsc-int-gpio");
> > +		if (!retval)
> > +			dev->irq = gpio_to_irq(irq_gpio);
> 
> Yeah, that's definitely the wrong way to handle this.  If the
> device it wired to a gpio controller, then the gpio controller also
> need to be an interrupt controller to ensure that it can map interrupt
> numbers.
> 

Here it is.  Please let me know if it is what you mean.

diff --git a/arch/arm/boot/dts/imx53-ard.dts b/arch/arm/boot/dts/imx53-ard.dts
index 6a007f1..0b17af8 100644
--- a/arch/arm/boot/dts/imx53-ard.dts
+++ b/arch/arm/boot/dts/imx53-ard.dts
@@ -320,7 +320,8 @@
 			compatible = "smsc,lan9220", "smsc,lan";
 			reg = <0xf4000000 0x2000000>;
 			phy-mode = "mii";
-			smsc-int-gpios = <&gpio1 31 0>; /* GPIO2_31 */
+			interrupt-parent = <&gpio1>;
+			interrupts = <31>;
 			smsc,irq-push-pull;
 			smsc,access-in-32bit;
 		};
diff --git a/arch/arm/boot/dts/imx53.dtsi b/arch/arm/boot/dts/imx53.dtsi
index 746221c..224f67f 100644
--- a/arch/arm/boot/dts/imx53.dtsi
+++ b/arch/arm/boot/dts/imx53.dtsi
@@ -89,6 +89,8 @@
 			interrupts = <50 51>;
 			gpio-controller;
 			#gpio-cells = <2>;
+			interrupt-controller;
+			#interrupt-cells = <1>;
 		};
 
 		gpio1: gpio@53f88000 { /* GPIO2 */
@@ -97,6 +99,8 @@
 			interrupts = <52 53>;
 			gpio-controller;
 			#gpio-cells = <2>;
+			interrupt-controller;
+			#interrupt-cells = <1>;
 		};
 
 		gpio2: gpio@53f8c000 { /* GPIO3 */
@@ -105,6 +109,8 @@
 			interrupts = <54 55>;
 			gpio-controller;
 			#gpio-cells = <2>;
+			interrupt-controller;
+			#interrupt-cells = <1>;
 		};
 
 		gpio3: gpio@53f90000 { /* GPIO4 */
@@ -113,6 +119,8 @@
 			interrupts = <56 57>;
 			gpio-controller;
 			#gpio-cells = <2>;
+			interrupt-controller;
+			#interrupt-cells = <1>;
 		};
 
 		wdt@53f98000 { /* WDOG1 */
@@ -950,6 +958,8 @@
 			interrupts = <103 104>;
 			gpio-controller;
 			#gpio-cells = <2>;
+			interrupt-controller;
+			#interrupt-cells = <1>;
 		};
 
 		gpio5: gpio@53fe0000 { /* GPIO6 */
@@ -958,6 +968,8 @@
 			interrupts = <105 106>;
 			gpio-controller;
 			#gpio-cells = <2>;
+			interrupt-controller;
+			#interrupt-cells = <1>;
 		};
 
 		gpio6: gpio@53fe4000 { /* GPIO7 */
@@ -966,6 +978,8 @@
 			interrupts = <107 108>;
 			gpio-controller;
 			#gpio-cells = <2>;
+			interrupt-controller;
+			#interrupt-cells = <1>;
 		};
 
 		i2c@53fec000 { /* I2C3 */
diff --git a/arch/arm/mach-mx5/imx53-dt.c b/arch/arm/mach-mx5/imx53-dt.c
index ac06f04..04dc1ab 100644
--- a/arch/arm/mach-mx5/imx53-dt.c
+++ b/arch/arm/mach-mx5/imx53-dt.c
@@ -51,6 +51,11 @@ static const struct of_device_id imx53_tzic_of_match[] __initconst = {
 	{ /* sentinel */ }
 };
 
+static const struct of_device_id imx53_gpio_of_match[] __initconst = {
+	{ .compatible = "fsl,imx53-gpio", },
+	{ /* sentinel */ }
+};
+
 static const struct of_device_id imx53_iomuxc_of_match[] __initconst = {
 	{ .compatible = "fsl,imx53-iomuxc", },
 	{ /* sentinel */ }
@@ -90,12 +95,28 @@ static void __init imx53_ard_eim_config(void)
 
 static void __init imx53_dt_init(void)
 {
+	int gpio_irq = MXC_INTERNAL_IRQS + ARCH_NR_GPIOS;
+
 	if (of_machine_is_compatible("fsl,imx53-ard"))
 		imx53_ard_eim_config();
 
 	mxc_iomuxc_dt_init(imx53_iomuxc_of_match);
 
 	irq_domain_generate_simple(imx53_tzic_of_match, MX53_TZIC_BASE_ADDR, 0);
+	gpio_irq -= 32;
+	irq_domain_generate_simple(imx53_gpio_of_match, MX53_GPIO1_BASE_ADDR, gpio_irq);
+	gpio_irq -= 32;
+	irq_domain_generate_simple(imx53_gpio_of_match, MX53_GPIO2_BASE_ADDR, gpio_irq);
+	gpio_irq -= 32;
+	irq_domain_generate_simple(imx53_gpio_of_match, MX53_GPIO3_BASE_ADDR, gpio_irq);
+	gpio_irq -= 32;
+	irq_domain_generate_simple(imx53_gpio_of_match, MX53_GPIO4_BASE_ADDR, gpio_irq);
+	gpio_irq -= 32;
+	irq_domain_generate_simple(imx53_gpio_of_match, MX53_GPIO5_BASE_ADDR, gpio_irq);
+	gpio_irq -= 32;
+	irq_domain_generate_simple(imx53_gpio_of_match, MX53_GPIO6_BASE_ADDR, gpio_irq);
+	gpio_irq -= 32;
+	irq_domain_generate_simple(imx53_gpio_of_match, MX53_GPIO7_BASE_ADDR, gpio_irq);
 
 	of_platform_populate(NULL, of_default_bus_match_table,
 			     imx53_auxdata_lookup, NULL);
diff --git a/drivers/net/smsc911x.c b/drivers/net/smsc911x.c
index 0097048..6dd025e 100644
--- a/drivers/net/smsc911x.c
+++ b/drivers/net/smsc911x.c
@@ -2187,25 +2187,10 @@ static int __devinit smsc911x_drv_probe(struct platform_device *pdev)
 
 	pdata = netdev_priv(dev);
 
-	if (np) {
-		irq_gpio = of_get_named_gpio(np, "smsc-int-gpios", 0);
-		retval = gpio_request_one(irq_gpio, GPIOF_IN, "smsc-int-gpio");
-		if (!retval)
-			dev->irq = gpio_to_irq(irq_gpio);
-	} else {
-		irq_res = platform_get_resource(pdev, IORESOURCE_IRQ, 0);
-		if (irq_res) {
-			dev->irq = irq_res->start;
-			irq_flags = irq_res->flags & IRQF_TRIGGER_MASK;
-		} else {
-			retval = -ENODEV;
-		}
-	}
-
-	if (retval) {
-		SMSC_WARN(pdata, probe, "Error smsc911x irq not found");
-		retval = -EINVAL;
-		goto out_free_netdev_2;
+	irq_res = platform_get_resource(pdev, IORESOURCE_IRQ, 0);
+	if (irq_res) {
+		dev->irq = irq_res->start;
+		irq_flags = irq_res->flags & IRQF_TRIGGER_MASK;
 	}
 
 	pdata->ioaddr = ioremap_nocache(res->start, res_size);


> > +	} else {
> > +		irq_res = platform_get_resource(pdev, IORESOURCE_IRQ, 0);
> > +		if (irq_res) {
> > +			dev->irq = irq_res->start;
> > +			irq_flags = irq_res->flags & IRQF_TRIGGER_MASK;
> > +		} else {
> > +			retval = -ENODEV;
> > +		}
> > +	}
> >  
> > -	pdata->dev = dev;
> > -	pdata->msg_enable = ((1 << debug) - 1);
> > +	if (retval) {
> > +		SMSC_WARN(pdata, probe, "Error smsc911x irq not found");
> > +		retval = -EINVAL;
> > +		goto out_free_netdev_2;
> > +	}
> >  
> > +	pdata->ioaddr = ioremap_nocache(res->start, res_size);
> >  	if (pdata->ioaddr == NULL) {
> >  		SMSC_WARN(pdata, probe, "Error smsc911x base address invalid");
> >  		retval = -ENOMEM;
> >  		goto out_free_netdev_2;
> >  	}
> >  
> > +	pdata->dev = dev;
> > +	pdata->msg_enable = ((1 << debug) - 1);
> > +
> > +	retval = smsc911x_probe_config_dt(&pdata->config, np);
> > +	if (retval && config) {
> > +		/* copy config parameters across to pdata */
> > +		memcpy(&pdata->config, config, sizeof(pdata->config));
> > +		retval = 0;
> > +	}
> > +
> > +	if (retval) {
> > +		SMSC_WARN(pdata, probe, "Error smsc911x config not found");
> > +		goto out_unmap_io_3;
> > +	}
> > +
> >  	/* assume standard, non-shifted, access to HW registers */
> >  	pdata->ops = &standard_smsc911x_ops;
> >  	/* apply the right access if shifting is needed */
> > -	if (config->shift)
> > +	if (pdata->config.shift)
> >  		pdata->ops = &shifted_smsc911x_ops;
> >  
> >  	retval = smsc911x_init(dev);
> > @@ -2314,6 +2380,12 @@ static const struct dev_pm_ops smsc911x_pm_ops = {
> >  #define SMSC911X_PM_OPS NULL
> >  #endif
> >  
> > +static const struct of_device_id smsc_dt_ids[] = {
> > +	{ .compatible = "smsc,lan", },
> 
> As mentioned above, "smsc,lan" is far too generic.
> 
Again, the two options are:

static const struct of_device_id smsc_dt_ids[] = {
        { .compatible = "smsc,lan9", },
        { /* sentinel */ }
};

Or:

static const struct of_device_id smsc_dt_ids[] = {
        { .compatible = "smsc,lan91", },
        { .compatible = "smsc,lan92", },
        { /* sentinel */ }
};

You pick :)

-- 
Regards,
Shawn

^ permalink raw reply related

* Re: tcp/udp checksum on loopback interface
From: Chris Friesen @ 2011-07-26  0:04 UTC (permalink / raw)
  To: Pierre Louis Aublin; +Cc: netdev
In-Reply-To: <4E292E51.4040802@inria.fr>

On 07/22/2011 02:01 AM, Pierre Louis Aublin wrote:
> Hello everybody
>
> I am interested in the reliability of TCP and UDP using the loopback
> interface.
> I found that there is no checksum verification on the body of packets
> transmitted through the loopback interface :

> Finally, why this behaviour? Is it because you assume message can not
> get corrupted while staying on the same machine?

That's correct.  We can save CPU time by not doing the checksum because 
we assume that our own hardware won't introduce errors (or, if it does 
and we care about them, we'll be monitoring the hardware for ECC errors 
anyway).
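
For a sense of the work being skipped, here is a minimal user-space sketch of the 16-bit one's-complement Internet checksum in the style of RFC 1071. It is not the kernel's optimized implementation, and the function name is made up for this example:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * One's-complement sum over 'len' bytes, read as big-endian 16-bit
 * words, folded to 16 bits and inverted.  The 32-bit accumulator is
 * fine for packet-sized buffers; much larger inputs would need
 * intermediate folding.
 */
static uint16_t ex_inet_checksum(const uint8_t *data, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)data[i] << 8) | data[i + 1];

	/* An odd trailing byte is padded with a zero low byte. */
	if (len & 1)
		sum += (uint32_t)data[len - 1] << 8;

	/* Fold the carries back into the low 16 bits. */
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);

	return (uint16_t)~sum;
}
```

Every payload byte has to be read once, so the cost grows linearly with the packet size; that is the CPU time saved when the check is skipped on loopback.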

Chris

-- 
Chris Friesen
Software Developer
GENBAND
chris.friesen@genband.com
www.genband.com

^ permalink raw reply

* Re: [PATCH 1/1] IPv4: Send gratuitous ARP for secondary IP addresses also
From: David Miller @ 2011-07-25 23:16 UTC (permalink / raw)
  To: schaman; +Cc: kuznet, pekkas, jmorris, yoshfuji, kaber, netdev, linux-kernel
In-Reply-To: <1311548970-27522-1-git-send-email-schaman@sch.bme.hu>

From: "Zoltan, Kiss" <schaman@sch.bme.hu>
Date: Mon, 25 Jul 2011 01:09:30 +0200

> From: Zoltan Kiss <schaman@sch.bme.hu>
> 
> If a device event generates gratuitous ARP messages, only primary
> address is used for sending. This patch iterates through the whole
> list. Tested with 2 IP addresses configuration on bonding interface.
> 
> Signed-off-by: Zoltan Kiss <schaman@sch.bme.hu>

Applied.

^ permalink raw reply

* Re: [PATCH] Do not leave router anycast address for /127 prefixes.
From: David Miller @ 2011-07-25 23:16 UTC (permalink / raw)
  To: yoshfuji; +Cc: netdev, bjorn, brian.haley
In-Reply-To: <1311543874-16901-1-git-send-email-yoshfuji@linux-ipv6.org>

From: yoshfuji@linux-ipv6.org
Date: Mon, 25 Jul 2011 06:44:34 +0900

> From: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
> 
> Original commit 2bda8a0c8af... "Disable router anycast
> address for /127 prefixes" says:
> 
> |   No need for matching code in addrconf_leave_anycast() as it
> |   will silently ignore any attempt to leave an unknown anycast
> |   address.
> 
> After analysis, because 1) we may add two or more prefixes on the
> same interface, or 2)user may have manually joined that anycast,
> we may hit chances to have anycast address which as if we had
> generated one by /127 prefix and we should not leave from subnet-
> router anycast address unconditionally.
> 
> CC: Bjørn Mork <bjorn@mork.no>
> CC: Brian Haley <brian.haley@hp.com>
> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>

Applied.

^ permalink raw reply

* Re: [PATCH net-next] net: Convert struct net_device uc_promisc to bool
From: David Miller @ 2011-07-25 23:17 UTC (permalink / raw)
  To: joe; +Cc: netdev, linux-kernel
In-Reply-To: <813214b21d4db8d08a76c648ea1d3286b441b706.1311604594.git.joe@perches.com>

From: Joe Perches <joe@perches.com>
Date: Mon, 25 Jul 2011 07:41:26 -0700

> No need to use int, its uses are boolean.
> May save a few bytes one day.
> 
> Signed-off-by: Joe Perches <joe@perches.com>

Applied.

^ permalink raw reply

* Re: [PATCH] drivers:connector:remove an unused variable *tracer*
From: David Miller @ 2011-07-25 23:16 UTC (permalink / raw)
  To: wanlong.gao; +Cc: zbr, jkosina, netdev, linux-kernel, gaowanlong
In-Reply-To: <1311476789-3858-1-git-send-email-wanlong.gao@gmail.com>

From: Wanlong Gao <wanlong.gao@gmail.com>
Date: Sun, 24 Jul 2011 11:06:29 +0800

> From: Wanlong Gao <gaowanlong@cn.fujitsu.com>
> 
> The variable 'tracer' never be used, so remove it.
> Added by f701e5b73a1a79ea62ffd45d9e2bed4c7d5c1fd2.
> 
> Signed-off-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>

Applied.

^ permalink raw reply

* Re: [PATCH] net: fix eth.c kernel-doc warning
From: David Miller @ 2011-07-25 23:17 UTC (permalink / raw)
  To: rdunlap; +Cc: netdev
In-Reply-To: <20110725114101.04e84376.rdunlap@xenotime.net>

From: Randy Dunlap <rdunlap@xenotime.net>
Date: Mon, 25 Jul 2011 11:41:01 -0700

> From: Randy Dunlap <rdunlap@xenotime.net>
> 
> Fix new kernel-doc warning in eth.c:
> 
> Warning(net/ethernet/eth.c:237): No description found for parameter 'type'
> 
> Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>

Applied.

^ permalink raw reply

* Re: [PATCH] acenic: use netdev_alloc_skb_ip_align
From: David Miller @ 2011-07-25 23:16 UTC (permalink / raw)
  To: shemminger; +Cc: eric.dumazet, netdev
In-Reply-To: <20110722093112.4caa39e7@nehalam.ftrdhcpuser.net>

From: Stephen Hemminger <shemminger@vyatta.com>
Date: Fri, 22 Jul 2011 09:31:12 -0700

> Take Eric's patch one step further.
> Use netdev_skb_ip_align to do setup the receive skb.
> Compile tested only.
> 
> Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>

Applied.

^ permalink raw reply

* Re: [PATCH 0/7] More sane neigh infrastructure
From: Roland Dreier @ 2011-07-25 22:49 UTC (permalink / raw)
  To: David Miller
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, netdev-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <20110725.141041.1092565620930748250.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>

On Mon, Jul 25, 2011 at 2:10 PM, David Miller <davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org> wrote:
> So call the normal ARP neigh solicit stuff in your neigh ops, and do
> your local stuff there as well.
>
> See if you can make it work.

Makes sense, I'll poke at that.  Thanks.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* RE: [net-next 10/10] ixgbe: convert to ndo_fix_features
From: Skidmore, Donald C @ 2011-07-25 22:42 UTC (permalink / raw)
  To: Michal Miroslaw, Kirsher, Jeffrey T
  Cc: davem@davemloft.net, netdev@vger.kernel.org, gospo@redhat.com,
	sassmann@redhat.com
In-Reply-To: <20110722080255.GA10125@rere.qmqm.pl>

>-----Original Message-----
>From: netdev-owner@vger.kernel.org [mailto:netdev-owner@vger.kernel.org]
>On Behalf Of Michal Miroslaw
>Sent: Friday, July 22, 2011 1:03 AM
>To: Kirsher, Jeffrey T
>Cc: davem@davemloft.net; Skidmore, Donald C; netdev@vger.kernel.org;
>gospo@redhat.com; sassmann@redhat.com
>Subject: Re: [net-next 10/10] ixgbe: convert to ndo_fix_features
>
>On Thu, Jul 21, 2011 at 11:09:11PM -0700, Jeff Kirsher wrote:
>> From: Don Skidmore <donald.c.skidmore@intel.com>
>>
>> Private rx_csum flags are now duplicate of netdev->features &
>> NETIF_F_RXCSUM.  We remove those duplicates and now use the
>net_device_ops
>> ndo_set_features.  This was based on the original patch submitted by
>> Michal Miroslaw <mirq-linux@rere.qmqm.pl>.  I also removed the special
>> case not requiring a reset for X540 hardware.  It is needed just as it
>is
>> in 82599 hardware.
>
>Looks mostly good now. Minor hints below.
>
>[...]
>> +static u32 ixgbe_fix_features(struct net_device *netdev, u32 data)
>> +{
>[...]
>> +	/* Turn off LRO if not RSC capable or invalid ITR settings */
>> +	if (!(adapter->flags2 & IXGBE_FLAG2_RSC_CAPABLE)) {
>> +		data &= ~NETIF_F_LRO;
>> +	} else if (!(adapter->flags2 & IXGBE_FLAG2_RSC_ENABLED) &&
>> +		   (adapter->rx_itr_setting != 1 &&
>> +		    adapter->rx_itr_setting > IXGBE_MAX_RSC_INT_RATE)) {
>> +		data &= ~NETIF_F_LRO;
>> +		e_info(probe, "rx-usecs set too low, not enabling RSC\n");
>> +	}
>
>Better:
>
>... else if (data & NETIF_F_LRO && adapter->rx_itr_setting != 1 &&
>adapter->rx_itr_setting > IXGBE_MAX_RSC_INT_RATE) {
>	e_info(...)
>	data &= ~NETIF_F_LRO;
>}
>

I see your point here.  The added complexity (checking IXGBE_FLAG2_RSC_ENABLED) is there to cover cases that come up in our out-of-tree driver's kcompat code.  This is why we mirror these feature flags to begin with.  I would like to modify the driver to use only the feature flags, but that will require some interesting kcompat code and will touch large parts of the driver, so I think it would be best done as a separate patch.

My concern in this conditional is when RSC is not just possible but actually turned on.  Currently the NETIF_F_LRO flag is enabled in probe if we are RSC capable.  But it is possible to be RSC capable yet not have it enabled, because the interrupt throttling rate is set too high.

>> +
>> +	return data;
>> +}
>> +
>> +static int ixgbe_set_features(struct net_device *netdev, u32 data)
>> +{
>> +	struct ixgbe_adapter *adapter = netdev_priv(netdev);
>> +	bool need_reset = false;
>> +
>> +	/* If Rx checksum is disabled, then RSC/LRO should also be
>disabled */
>> +	if (!(data & NETIF_F_RXCSUM))
>> +		adapter->flags &= ~IXGBE_FLAG_RX_CSUM_ENABLED;
>> +	else
>> +		adapter->flags |= IXGBE_FLAG_RX_CSUM_ENABLED;
>
>This exactly mirrors NETIF_F_RXCSUM. Waiting for later cleanup?

That's pretty much it.  I would like to clean up all of these after figuring out the best way to create the kcompat code for the out-of-tree driver, which has to support old kernel versions.  That seems worthy of its own patch.

>
>[...]
>> +	/*
>> +	 * Check if Flow Director n-tuple support was enabled or disabled.
>If
>> +	 * the state changed, we need to reset.
>> +	 */
>> +	if (!(adapter->flags & IXGBE_FLAG_FDIR_PERFECT_CAPABLE)) {
>> +		/* turn off ATR, enable perfect filters and reset */
>> +		if (data & NETIF_F_NTUPLE) {
>> +			adapter->flags &= ~IXGBE_FLAG_FDIR_HASH_CAPABLE;
>> +			adapter->flags |= IXGBE_FLAG_FDIR_PERFECT_CAPABLE;
>> +			need_reset = true;
>> +		}
>> +	} else if (!(data & NETIF_F_NTUPLE)) {
>> +		/* turn off Flow Director, set ATR and reset */
>> +		adapter->flags &= ~IXGBE_FLAG_FDIR_PERFECT_CAPABLE;
>> +		if ((adapter->flags &  IXGBE_FLAG_RSS_ENABLED) &&
>> +		    !(adapter->flags &  IXGBE_FLAG_DCB_ENABLED))
>> +			adapter->flags |= IXGBE_FLAG_FDIR_HASH_CAPABLE;
>> +		need_reset = true;
>> +	}
>
>You could make this more readable:
>
>old = adapter->flags;
>if (data & NETIF_F_NTUPLE) {
>	adapter->flags |= IXGBE_FLAG_FDIR_PERFECT_CAPABLE;
>	adapter->flags &= ~IXGBE_FLAG_FDIR_HASH_CAPABLE;
>} else {
>	adapter->flags &= ~IXGBE_FLAG_FDIR_PERFECT_CAPABLE;
>	if ((adapter->flags &  IXGBE_FLAG_RSS_ENABLED) &&
>	    !(adapter->flags &  IXGBE_FLAG_DCB_ENABLED))
>		adapter->flags |= IXGBE_FLAG_FDIR_HASH_CAPABLE;
>}
>if (old != adapter->flags)
>	need_reset = true;
>

I agree this is a lot more readable.  Wish I would have thought of it. :)

I'll send out a new patch with at least this modification.
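
For reference, the shape of that rework as a stand-alone sketch. The flag values and the function name are placeholders for this example, not the driver's definitions:

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit values for this sketch only. */
#define EX_FDIR_HASH_CAPABLE	(1u << 0)
#define EX_FDIR_PERFECT_CAPABLE	(1u << 1)
#define EX_RSS_ENABLED		(1u << 2)
#define EX_DCB_ENABLED		(1u << 3)

/*
 * Update the Flow Director bits in *flags according to whether n-tuple
 * filtering was requested.  Returns true when the flags changed, i.e.
 * when the caller would need to reset the device.
 */
static bool ex_update_fdir_flags(uint32_t *flags, bool ntuple)
{
	uint32_t old = *flags;

	if (ntuple) {
		/* perfect filters on, ATR (hash) off */
		*flags |= EX_FDIR_PERFECT_CAPABLE;
		*flags &= ~EX_FDIR_HASH_CAPABLE;
	} else {
		/* perfect filters off, ATR back on if RSS without DCB */
		*flags &= ~EX_FDIR_PERFECT_CAPABLE;
		if ((*flags & EX_RSS_ENABLED) && !(*flags & EX_DCB_ENABLED))
			*flags |= EX_FDIR_HASH_CAPABLE;
	}

	return old != *flags;
}
```

One thing to watch: comparing the old and new flags also triggers a reset when only the hash bit changes, which the original nested conditionals did not do in every case.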

>> +
>> +	if (need_reset)
>> +		ixgbe_do_reset(netdev);
>> +
>> +	return 0;
>> +
>> +}
>> +
>>  static const struct net_device_ops ixgbe_netdev_ops = {
>>  	.ndo_open		= ixgbe_open,
>>  	.ndo_stop		= ixgbe_close,
>> @@ -7153,6 +7228,8 @@ static const struct net_device_ops ixgbe_netdev_ops = {
>>  	.ndo_fcoe_disable = ixgbe_fcoe_disable,
>>  	.ndo_fcoe_get_wwn = ixgbe_fcoe_get_wwn,
>>  #endif /* IXGBE_FCOE */
>> +	.ndo_set_features = ixgbe_set_features,
>> +	.ndo_fix_features = ixgbe_fix_features,
>>  };
>>
>>  static void __devinit ixgbe_probe_vf(struct ixgbe_adapter *adapter,
>> @@ -7420,20 +7497,24 @@ static int __devinit ixgbe_probe(struct pci_dev *pdev,
>>
>>  	netdev->features = NETIF_F_SG |
>>  			   NETIF_F_IP_CSUM |
>> +			   NETIF_F_IPV6_CSUM |
>>  			   NETIF_F_HW_VLAN_TX |
>>  			   NETIF_F_HW_VLAN_RX |
>> -			   NETIF_F_HW_VLAN_FILTER;
>> +			   NETIF_F_HW_VLAN_FILTER |
>> +			   NETIF_F_TSO |
>> +			   NETIF_F_TSO6 |
>> +			   NETIF_F_GRO |
>> +			   NETIF_F_RXHASH |
>> +			   NETIF_F_RXCSUM;
>
>Drop NETIF_F_GRO here, as it's always set by the network core now.
>

Will do. 

>> -	netdev->features |= NETIF_F_IPV6_CSUM;
>> -	netdev->features |= NETIF_F_TSO;
>> -	netdev->features |= NETIF_F_TSO6;
>> -	netdev->features |= NETIF_F_GRO;
>> -	netdev->features |= NETIF_F_RXHASH;
>> +	netdev->hw_features = netdev->features;
>>
>>  	switch (adapter->hw.mac.type) {
>>  	case ixgbe_mac_82599EB:
>>  	case ixgbe_mac_X540:
>>  		netdev->features |= NETIF_F_SCTP_CSUM;
>> +		netdev->hw_features |= NETIF_F_SCTP_CSUM |
>> +				       NETIF_F_NTUPLE;
>
>NTUPLE disabled by default. That's the idea?

We default to ATR on, which disables perfect filters (NETIF_F_NTUPLE).

>
>Best Regards,
>Michał Mirosław


* Re: [PATCH] cxgb3i: ref count cdev access to prevent modification while in use
From: Divy Le Ray @ 2011-07-25 21:19 UTC (permalink / raw)
  To: Neil Horman; +Cc: netdev, Steve Wise, David S. Miller, Karen Xie
In-Reply-To: <1311623817-6417-1-git-send-email-nhorman@tuxdriver.com>

On 07/25/2011 12:56 PM, Neil Horman wrote:
>
> This oops was reported recently:
> d:mon> e
> cpu 0xd: Vector: 300 (Data Access) at [c0000000fd4c7120]
>     pc: d00000000076f194: .t3_l2t_get+0x44/0x524 [cxgb3]
>     lr: d000000000b02108: .init_act_open+0x150/0x3d4 [cxgb3i]
>     sp: c0000000fd4c73a0
>    msr: 8000000000009032
>    dar: 0
>  dsisr: 40000000
>   current = 0xc0000000fd640d40
>   paca    = 0xc00000000054ff80
>     pid   = 5085, comm = iscsid
> d:mon> t
> [c0000000fd4c7450] d000000000b02108 .init_act_open+0x150/0x3d4 [cxgb3i]
> [c0000000fd4c7500] d000000000e45378 .cxgbi_ep_connect+0x784/0x8e8 [libcxgbi]
> [c0000000fd4c7650] d000000000db33f0 .iscsi_if_rx+0x71c/0xb18 [scsi_transport_iscsi2]
> [c0000000fd4c7740] c000000000370c9c .netlink_data_ready+0x40/0xa4
> [c0000000fd4c77c0] c00000000036f010 .netlink_sendskb+0x4c/0x9c
> [c0000000fd4c7850] c000000000370c18 .netlink_sendmsg+0x358/0x39c
> [c0000000fd4c7950] c00000000033be24 .sock_sendmsg+0x114/0x1b8
> [c0000000fd4c7b50] c00000000033d208 .sys_sendmsg+0x218/0x2ac
> [c0000000fd4c7d70] c00000000033f55c .sys_socketcall+0x228/0x27c
> [c0000000fd4c7e30] c0000000000086a4 syscall_exit+0x0/0x40
> --- Exception: c01 (System Call) at 00000080da560cfc
>
> The root cause was an EEH error, which sent us down the offload_close path in
> the cxgb3 driver, which in turn sets cdev->lldev to NULL, without regard for
> upper layer driver (like the cxgbi drivers) which might have execution contexts
> in the middle of its use. The result is the oops above, when t3_l2t_get attempts
> to dereference cdev->lldev right after the EEH error handler sets it to NULL.
>
> The fix is to reference count the cdev structure.  When an EEH error occurs, the
> shutdown path:
> t3_adapter_error->offload_close->cxgb3i_remove_clients->cxgb3i_dev_close
> will now block until such time as the cdev pointer has a use count of zero.
> This coupled with the fact that lookups will now skip finding any registered
> cdev's in cxgbi_device_find_by_[lldev|netdev] with the CXGBI_FLAG_ADAPTER_RESET
> bit set ensures that on an EEH, the setting of lldev to NULL in offload_close
> will only happen after there are no longer any active users of the data
> structure.
>
> This has been tested by the reporter and shown to fix the reported oops
>
> Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
> CC: Divy Le Ray <divy@chelsio.com>
> CC: Steve Wise <swise@chelsio.com>
> CC: "David S. Miller" <davem@davemloft.net>
>

Also cc-ing Karen.

> ---
>  drivers/scsi/cxgbi/cxgb3i/cxgb3i.c |    8 +++++++-
>  drivers/scsi/cxgbi/libcxgbi.c      |    9 +++++++++
>  drivers/scsi/cxgbi/libcxgbi.h      |    3 +++
>  3 files changed, 19 insertions(+), 1 deletions(-)
>
> diff --git a/drivers/scsi/cxgbi/cxgb3i/cxgb3i.c b/drivers/scsi/cxgbi/cxgb3i/cxgb3i.c
> index abc7b12..7d752cd 100644
> --- a/drivers/scsi/cxgbi/cxgb3i/cxgb3i.c
> +++ b/drivers/scsi/cxgbi/cxgb3i/cxgb3i.c
> @@ -1301,9 +1301,13 @@ static void cxgb3i_dev_close(struct t3cdev *t3dev)
>
>         if (!cdev || cdev->flags & CXGBI_FLAG_ADAPTER_RESET) {
>                 pr_info("0x%p close, f 0x%x.\n", cdev, cdev ? cdev->flags : 0);
> +               if (cdev)
> +                       cdev_put(cdev);
> +               while (cdev && atomic_read(&cdev->use_count) != 0)
> +                       msleep(1);
>                 return;
>         }
> -
> +       cdev_put(cdev);
>         cxgbi_device_unregister(cdev);
>  }
>
> @@ -1318,6 +1322,7 @@ static void cxgb3i_dev_open(struct t3cdev *t3dev)
>         int i, err;
>
>         if (cdev) {
> +               cdev_put(cdev);
>                 pr_info("0x%p, updating.\n", cdev);
>                 return;
>         }
> @@ -1390,6 +1395,7 @@ static void cxgb3i_dev_event_handler(struct t3cdev *t3dev, u32 event, u32 port)
>                 cdev->flags &= ~CXGBI_FLAG_ADAPTER_RESET;
>                 break;
>         }
> +       cdev_put(cdev);
>  }
>
>  /**
> diff --git a/drivers/scsi/cxgbi/libcxgbi.c b/drivers/scsi/cxgbi/libcxgbi.c
> index 77ac217..eb5625d 100644
> --- a/drivers/scsi/cxgbi/libcxgbi.c
> +++ b/drivers/scsi/cxgbi/libcxgbi.c
> @@ -181,6 +181,9 @@ struct cxgbi_device *cxgbi_device_find_by_lldev(void *lldev)
>         mutex_lock(&cdev_mutex);
>         list_for_each_entry_safe(cdev, tmp, &cdev_list, list_head) {
>                 if (cdev->lldev == lldev) {
> +                       if (cdev->flags & CXGBI_FLAG_ADAPTER_RESET)
> +                               continue;
> +                       cdev_hold(cdev);
>                         mutex_unlock(&cdev_mutex);
>                         return cdev;
>                 }
> @@ -210,7 +213,10 @@ static struct cxgbi_device *cxgbi_device_find_by_netdev(struct net_device *ndev,
>         list_for_each_entry_safe(cdev, tmp, &cdev_list, list_head) {
>                 for (i = 0; i < cdev->nports; i++) {
>                         if (ndev == cdev->ports[i]) {
> +                               if (cdev->flags & CXGBI_FLAG_ADAPTER_RESET)
> +                                       continue;
>                                 cdev->hbas[i]->vdev = vdev;
> +                               cdev_hold(cdev);
>                                 mutex_unlock(&cdev_mutex);
>                                 if (port)
>                                         *port = i;
> @@ -542,6 +548,8 @@ rel_rt:
>         if (csk)
>                 cxgbi_sock_closed(csk);
>  err_out:
> +       if (cdev)
> +               cdev_put(cdev);
>         return ERR_PTR(err);
>  }
>
> @@ -2491,6 +2499,7 @@ struct iscsi_endpoint *cxgbi_ep_connect(struct Scsi_Host *shost,
>         return ep;
>
>  release_conn:
> +       cdev_put(&csk->cdev);
>         cxgbi_sock_put(csk);
>         cxgbi_sock_closed(csk);
>  err_out:
> diff --git a/drivers/scsi/cxgbi/libcxgbi.h b/drivers/scsi/cxgbi/libcxgbi.h
> index 9267844..aad1749 100644
> --- a/drivers/scsi/cxgbi/libcxgbi.h
> +++ b/drivers/scsi/cxgbi/libcxgbi.h
> @@ -514,6 +514,7 @@ struct cxgbi_device {
>         unsigned int flags;
>         struct net_device **ports;
>         void *lldev;
> +       atomic_t use_count;
>         struct cxgbi_hba **hbas;
>         const unsigned short *mtus;
>         unsigned char nmtus;
> @@ -557,6 +558,8 @@ struct cxgbi_device {
>         void *dd_data;
>  };
>  #define cxgbi_cdev_priv(cdev)  ((cdev)->dd_data)
> +#define cdev_hold(x) do {atomic_inc(&x->use_count);} while(0)
> +#define cdev_put(x) do {atomic_dec(&x->use_count);} while(0)
>
>  struct cxgbi_conn {
>         struct cxgbi_endpoint *cep;
> --
> 1.7.3.4
>
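
The hold/put scheme described in the commit message can be sketched in userspace as follows. This is an illustration under stated assumptions, not the libcxgbi code: struct demo_dev, DEV_FLAG_RESET and the demo_* helpers are hypothetical names, C11 atomics stand in for the kernel's atomic_t, and the cdev_mutex locking of the real lookup is left out. The lookup skips devices marked as resetting and takes a reference on a match; the close path would wait until demo_idle() returns true.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for CXGBI_FLAG_ADAPTER_RESET. */
#define DEV_FLAG_RESET 0x1u

/* Hypothetical, cut-down stand-in for struct cxgbi_device. */
struct demo_dev {
	unsigned int flags;
	void *lldev;
	atomic_int use_count;
};

/* Take a reference on the device. */
static inline void demo_hold(struct demo_dev *d)
{
	atomic_fetch_add(&d->use_count, 1);
}

/* Drop a reference on the device. */
static inline void demo_put(struct demo_dev *d)
{
	atomic_fetch_sub(&d->use_count, 1);
}

/* True once no caller holds a reference any more. */
static inline bool demo_idle(struct demo_dev *d)
{
	return atomic_load(&d->use_count) == 0;
}

/*
 * Find the device bound to lldev. Devices in reset are skipped, so no
 * new reference can be taken while the close path waits for idle.
 * On success the caller owns one reference and must demo_put() it.
 */
static struct demo_dev *demo_find(struct demo_dev *devs, size_t n, void *lldev)
{
	for (size_t i = 0; i < n; i++) {
		if (devs[i].lldev != lldev)
			continue;
		if (devs[i].flags & DEV_FLAG_RESET)
			continue;
		demo_hold(&devs[i]);
		return &devs[i];
	}
	return NULL;
}
```

Inline functions are used here instead of macros so that an argument such as &obj->member is evaluated as one expression; with a macro body of the form &x->use_count and no parentheses around x, that argument would expand textually to &&obj->member->use_count.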



* ipoib crash when booting with ipautoconfig
From: Yinghai Lu @ 2011-07-25 21:39 UTC (permalink / raw)
  To: David Miller, NetDev

[   88.946112] BUG: unable to handle kernel NULL pointer dereference at 0000000000000040
[   88.954153] IP: [<ffffffff81a61ee1>] ipoib_start_xmit+0x5c/0x31a
[   88.960196] PGD 0
[   88.962235] Oops: 0000 [#1] SMP
[   88.965500] CPU 0
[   88.967343] Modules linked in:
[   88.970601]
[   88.972094] Pid: 1, comm: swapper Not tainted 3.0.0-tip-yh-04912-gf572f66-dirty #1148
[   88.984053] RIP: 0010:[<ffffffff81a61ee1>]  [<ffffffff81a61ee1>] ipoib_start_xmit+0x5c/0x31a
[   88.992497] RSP: 0018:ffff88046dd37cb0  EFLAGS: 00010246
[   88.997805] RAX: 0000000000000000 RBX: ffff88045a5af380 RCX: 0000000060014c20
[   89.004933] RDX: 0000000000000000 RSI: ffff880464768000 RDI: ffff88045a5af380
[   89.012060] RBP: ffff88046dd37cf0 R08: 0000000000001d1c R09: 000000000039e295
[   89.019188] R10: 0000000000000000 R11: ffff88046dd38708 R12: ffff880464768000
[   89.026317] R13: 0000000000000000 R14: ffff8804647689c0 R15: ffff880464768000
[   89.033447] FS:  0000000000000000(0000) GS:ffff88047de00000(0000) knlGS:0000000000000000
[   89.041524] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[   89.047268] CR2: 0000000000000040 CR3: 00000000023fc000 CR4: 00000000000406f0
[   89.054394] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   89.061523] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[   89.068652] Process swapper (pid: 1, threadinfo ffff88046dd36000, task ffff88046dd38000)
[   89.076729] Stack:
[   89.078741]  0000000000000000 ffff88046dd38000 0000000000000246 ffff88045a5af380
[   89.086189]  0000000060014c23 0000000000000000 0000000000000258 ffff880464768000
[   89.093645]  ffff88046dd37d50 ffffffff81b15a31 ffff880670687e98 ffff880670687e80
[   89.101093] Call Trace:
[   89.103549]  [<ffffffff81b15a31>] dev_hard_start_xmit+0x2a7/0x435
[   89.109658]  [<ffffffff81b2984d>] sch_direct_xmit+0x74/0x207
[   89.115317]  [<ffffffff81b15db9>] dev_queue_xmit+0x1fa/0x4aa
[   89.120981]  [<ffffffff81b15bbf>] ? dev_hard_start_xmit+0x435/0x435
[   89.127267]  [<ffffffff81b0abd2>] ? __alloc_skb+0x83/0x141
[   89.132774]  [<ffffffff82754586>] ic_bootp_send_if+0x2b4/0x2d2
[   89.138597]  [<ffffffff82754709>] ic_dynamic+0x165/0x315
[   89.143913]  [<ffffffff8275508e>] ip_auto_config+0xf2/0x2c0
[   89.149484]  [<ffffffff82754f9c>] ? root_nfs_parse_addr+0xb5/0xb5
[   89.155604]  [<ffffffff810002cf>] do_one_initcall+0x57/0x134
[   89.161278]  [<ffffffff82708f85>] kernel_init+0x11f/0x1a3
[   89.166699]  [<ffffffff81c471d4>] kernel_thread_helper+0x4/0x10
[   89.172640]  [<ffffffff81c3e61d>] ? retint_restore_args+0xe/0xe
[   89.178550]  [<ffffffff82708e66>] ? start_kernel+0x3c9/0x3c9
[   89.184212]  [<ffffffff81c471d0>] ? gs_change+0xb/0xb
[   89.189261] Code: 1d 48 c7 c7 60 11 42 82 e8 e0 d7 64 ff 85 c0 75 0d e8 91 5e 63 ff 85 c0 0f 84 89 02 00 00 48 8b 43 58 48 89 c2 48 83 e2 fe a8 01 <4c> 8b 7a 40 74 26 e8 3c 5e 63 ff 85 c0 74 1d 48 c7 c7 60 11 42
[   89.209206] RIP  [<ffffffff81a61ee1>] ipoib_start_xmit+0x5c/0x31a
[   89.215311]  RSP <ffff88046dd37cb0>
[   89.218792] CR2: 0000000000000040
[   89.222287] ---[ end trace 88d8b543880f18a5 ]---
