Netdev List
* Re: >Re: [RFC] should VM_BUG_ON(cond) really evaluate cond
From: Eric Dumazet @ 2011-10-30 15:16 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Linus Torvalds, Ben Hutchings, linux-kernel, netdev,
	Andrew Morton
In-Reply-To: <20111030095918.GA19676@one.firstfloor.org>

On Sunday, 30 October 2011 at 10:59 +0100, Andi Kleen wrote:
> > +#define ACCESS_AT_MOST_ONCE(x)			\
> > +	({	unsigned long __y;		\
> 
> why not typeof here?
> 
> > +		asm("":"=r" (__y):"0" (x));	\
> > +		(__force __typeof__(x)) __y;	\
> > +	})
> > +
> 
> -Andi

Because it doesn't work if x is const.

/data/src/linux/arch/x86/include/asm/atomic.h: In function ‘atomic_read’:
/data/src/linux/arch/x86/include/asm/atomic.h:25:2: error: read-only variable ‘__y’ used as ‘asm’ output

I understand it won't work for u64 types on 32-bit arches, but is
ACCESS_AT_MOST_ONCE() sensible for this kind of usage?

In this V2, I added a check on sizeof(x) to trigger a compile error.

BTW, I forgot the possible use of ACCESS_AT_MOST_ONCE() in atomic64_read()
in arch/x86/include/asm/atomic64_64.h; this saves 600 more bytes :)

On 32bit, I am afraid we cannot change current behavior, because of the
ATOMIC64_ALTERNATIVE() use.

Thanks !

[PATCH] atomic: introduce ACCESS_AT_MOST_ONCE() helper

In commit 4e60c86bd9e (gcc-4.6: mm: fix unused but set warnings)
Andi forced VM_BUG_ON(cond) to evaluate cond, even if CONFIG_DEBUG_VM is
not set :

#ifdef CONFIG_DEBUG_VM
#define VM_BUG_ON(cond) BUG_ON(cond)
#else
#define VM_BUG_ON(cond) do { (void)(cond); } while (0)
#endif

As a side effect, get_page()/put_page_testzero() perform more bus
transactions on a contended cache line in some workloads (tcp_sendmsg()
for example, where a page is acting as a shared buffer):

0,05 :  ffffffff815e4775:       je     ffffffff815e4970 <tcp_sendmsg+0xc80>
0,05 :  ffffffff815e477b:       mov    0x1c(%r9),%eax    // useless
3,32 :  ffffffff815e477f:       mov    (%r9),%rax        // useless
0,51 :  ffffffff815e4782:       lock incl 0x1c(%r9)
3,87 :  ffffffff815e4787:       mov    (%r9),%rax
0,00 :  ffffffff815e478a:       test   $0x80,%ah
0,00 :  ffffffff815e478d:       jne    ffffffff815e49f2 <tcp_sendmsg+0xd02>

That's because both atomic_read() and constant_test_bit() use a volatile
attribute, so the compiler is forced to perform the read even if the
result is optimized away.

Linus suggested using an asm("") trick and placing it in a variant of
ACCESS_ONCE(), allowing the compiler to omit the memory read if the
result is unused.

This patch introduces the ACCESS_AT_MOST_ONCE() helper and uses it in the
x86 implementations of atomic_read() and constant_test_bit().

It's also used in the x86_64 atomic64_read() implementation.

On x86_64, this reduces vmlinux text a bit (if CONFIG_DEBUG_VM=n):

# size vmlinux.old vmlinux.new
    text     data      bss       dec     hex  filename
10706848  2894216  1540096  15141160  e70928  vmlinux.old
10704040  2894216  1540096  15138352  e6fe30  vmlinux.new

Based on a prior patch from Linus, and review from Andi

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
V2: Add a check on sizeof(x) in ACCESS_AT_MOST_ONCE()
    Use ACCESS_AT_MOST_ONCE() on x86_64 atomic64_read()

 arch/x86/include/asm/atomic.h      |    2 +-
 arch/x86/include/asm/atomic64_64.h |    2 +-
 arch/x86/include/asm/bitops.h      |    7 +++++--
 include/asm-generic/atomic.h       |    2 +-
 include/linux/compiler.h           |   15 +++++++++++++++
 5 files changed, 23 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/asm/atomic.h b/arch/x86/include/asm/atomic.h
index 58cb6d4..b1f0c6b 100644
--- a/arch/x86/include/asm/atomic.h
+++ b/arch/x86/include/asm/atomic.h
@@ -22,7 +22,7 @@
  */
 static inline int atomic_read(const atomic_t *v)
 {
-	return (*(volatile int *)&(v)->counter);
+	return ACCESS_AT_MOST_ONCE(v->counter);
 }
 
 /**
diff --git a/arch/x86/include/asm/atomic64_64.h b/arch/x86/include/asm/atomic64_64.h
index 0e1cbfc..bdca6fa 100644
--- a/arch/x86/include/asm/atomic64_64.h
+++ b/arch/x86/include/asm/atomic64_64.h
@@ -18,7 +18,7 @@
  */
 static inline long atomic64_read(const atomic64_t *v)
 {
-	return (*(volatile long *)&(v)->counter);
+	return ACCESS_AT_MOST_ONCE(v->counter);
 }
 
 /**
diff --git a/arch/x86/include/asm/bitops.h b/arch/x86/include/asm/bitops.h
index 1775d6e..e30a190 100644
--- a/arch/x86/include/asm/bitops.h
+++ b/arch/x86/include/asm/bitops.h
@@ -308,8 +308,11 @@ static inline int test_and_change_bit(int nr, volatile unsigned long *addr)
 
 static __always_inline int constant_test_bit(unsigned int nr, const volatile unsigned long *addr)
 {
-	return ((1UL << (nr % BITS_PER_LONG)) &
-		(addr[nr / BITS_PER_LONG])) != 0;
+	const unsigned long *word = (const unsigned long *)addr +
+				    (nr / BITS_PER_LONG);
+	unsigned long bit = 1UL << (nr % BITS_PER_LONG);
+
+	return (bit & ACCESS_AT_MOST_ONCE(*word)) != 0;
 }
 
 static inline int variable_test_bit(int nr, volatile const unsigned long *addr)
diff --git a/include/asm-generic/atomic.h b/include/asm-generic/atomic.h
index e37963c..c05e21f 100644
--- a/include/asm-generic/atomic.h
+++ b/include/asm-generic/atomic.h
@@ -39,7 +39,7 @@
  * Atomically reads the value of @v.
  */
 #ifndef atomic_read
-#define atomic_read(v)	(*(volatile int *)&(v)->counter)
+#define atomic_read(v)	ACCESS_AT_MOST_ONCE((v)->counter)
 #endif
 
 /**
diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index 320d6c9..bd18562 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -308,4 +308,19 @@ void ftrace_likely_update(struct ftrace_branch_data *f, int val, int expect);
  */
 #define ACCESS_ONCE(x) (*(volatile typeof(x) *)&(x))
 
+#ifndef __ASSEMBLY__
+/*
+ * Like ACCESS_ONCE, but can be optimized away if nothing uses the value,
+ * and/or merged with previous non-ONCE accesses.
+ */
+extern void ACCESS_AT_MOST_ONCE_bad(void);
+#define ACCESS_AT_MOST_ONCE(x)				\
+	({	unsigned long __y;			\
+		if (sizeof(x) > sizeof(__y))		\
+			ACCESS_AT_MOST_ONCE_bad();	\
+		asm("":"=r" (__y):"0" (x));		\
+		(__force __typeof__(x)) __y;		\
+	})
+#endif /* __ASSEMBLY__ */
+
 #endif /* __LINUX_COMPILER_H */
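
For readers who want to try the idea outside the kernel, here is a minimal userspace sketch of the asm("") trick described in the commit message above, assuming GCC or Clang. It is not the patch itself: the names READ_VOLATILE, ACCESS_AT_MOST_ONCE_SKETCH and struct demo_counter are made up for illustration, the size check uses _Static_assert rather than the undefined ACCESS_AT_MOST_ONCE_bad() function from the patch, the asm input is cast to unsigned long so both operands have the same width, and the kernel-only __force annotation is dropped.

```c
/*
 * Illustrative userspace sketch only; every name here is hypothetical.
 */

/* Volatile-based read, in the style of the pre-patch atomic_read(): the
 * load has to be emitted even when the caller throws the result away. */
#define READ_VOLATILE(x) (*(volatile __typeof__(x) *)&(x))

/* The value is passed through an empty, non-volatile asm statement. The
 * compiler cannot see through the asm, but it may delete the whole
 * statement (and the load feeding it) when the result is unused. */
#define ACCESS_AT_MOST_ONCE_SKETCH(x)					\
	({	unsigned long __y;					\
		_Static_assert(sizeof(x) <= sizeof(__y),		\
			       "operand wider than unsigned long");	\
		__asm__("" : "=r" (__y) : "0" ((unsigned long)(x)));	\
		(__typeof__(x))__y;					\
	})

struct demo_counter {
	int value;
};

/* Result is used: a single load of c->value is expected. */
static inline int demo_read(const struct demo_counter *c)
{
	return ACCESS_AT_MOST_ONCE_SKETCH(c->value);
}

/* Result is discarded: the compiler is free to emit no load at all,
 * which is the case VM_BUG_ON() hits when CONFIG_DEBUG_VM is not set. */
static inline void demo_discard(const struct demo_counter *c)
{
	(void)ACCESS_AT_MOST_ONCE_SKETCH(c->value);
}
```

Comparing the optimized assembly of demo_discard() against a variant built on READ_VOLATILE() should show the difference the commit message describes, although the exact output depends on compiler and flags.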


* [PATCH] batman-adv: Fix range check for expected packets
From: Simon Wunderlich @ 2011-10-30 15:22 UTC (permalink / raw)
  To: b.a.t.m.a.n-ZwoEplunGu2X36UT3dwllkB+6BGkLq7r
  Cc: netdev-u79uwXL29TY76Z2rM5mHXA, Simon Wunderlich
In-Reply-To: <4EAC49D7.2060609-XXsH3GEs1jrby3iVrkZq2A@public.gmane.org>

The check for new packets in the future used a wrong binary operator,
which makes the check expression always true and accepting too many
packets.

Reported-by: Thomas Jarosch <thomas.jarosch-XXsH3GEs1jrby3iVrkZq2A@public.gmane.org>
Signed-off-by: Simon Wunderlich <siwu-MaAgPAbsBIVS8oHt8HbXEIQuADTiUCJX@public.gmane.org>
---
 bitarray.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/bitarray.c b/bitarray.c
index 0be9ff3..9bc63b2 100644
--- a/bitarray.c
+++ b/bitarray.c
@@ -155,7 +155,7 @@ int bit_get_packet(void *priv, unsigned long *seq_bits,
 	/* sequence number is much newer, probably missed a lot of packets */
 
 	if ((seq_num_diff >= TQ_LOCAL_WINDOW_SIZE)
-		|| (seq_num_diff < EXPECTED_SEQNO_RANGE)) {
+		&& (seq_num_diff < EXPECTED_SEQNO_RANGE)) {
 		bat_dbg(DBG_BATMAN, bat_priv,
 			"We missed a lot of packets (%i) !\n",
 			seq_num_diff - 1);
-- 
1.7.7
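
To see why the old condition was always true, here is a small standalone sketch of the two predicates. The constant values (64 and 65536) and the int32_t type are assumptions made for illustration and are not taken from this patch; the function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed values, for illustration only. */
#define TQ_LOCAL_WINDOW_SIZE	64
#define EXPECTED_SEQNO_RANGE	65536

/* Old check: any value is either >= 64 or < 65536, so with a window
 * smaller than the range this is true for every input. */
static inline bool demo_missed_many_old(int32_t seq_num_diff)
{
	return (seq_num_diff >= TQ_LOCAL_WINDOW_SIZE) ||
	       (seq_num_diff < EXPECTED_SEQNO_RANGE);
}

/* Fixed check: true only inside the half-open range [64, 65536). */
static inline bool demo_missed_many_new(int32_t seq_num_diff)
{
	return (seq_num_diff >= TQ_LOCAL_WINDOW_SIZE) &&
	       (seq_num_diff < EXPECTED_SEQNO_RANGE);
}
```

With these assumed constants, a difference of 1 or of 100000 passes the old check but not the new one, while a difference of 100 passes both.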


* [PATCH] net: make the tcp and udp file_operations for the /proc stuff const
From: Arjan van de Ven @ 2011-10-30 16:46 UTC (permalink / raw)
  To: netdev, davem

Hi.

As most of you probably already know, there's a strong desire to make struct file_operations
(and similar structures) const throughout the kernel. I'm sure something like this patch
has been posted here before, so I realize I'm treading on tricky ground ;-)

I'll try anyway with the patch below, at least to get feedback on how to do this better
if nothing else...



From 85d9ba34b3a6ad60a2b5ac3421eebdb5bbf81f7d Mon Sep 17 00:00:00 2001
From: Arjan van de Ven <arjan@linux.intel.com>
Date: Sun, 30 Oct 2011 09:25:32 -0700
Subject: [PATCH] net: make the tcp and udp file_operations for the /proc stuff const

The tcp and udp code fills in a set of struct file_operations at runtime,
although this can also be done at compile time, with the added benefit that
these file operations can then be const.

The trickiest part was getting the "THIS_MODULE" reference right; the naive
method of declaring a struct at the place of registration would not work
for this reason.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
---
 include/net/tcp.h   |   10 ++++++----
 include/net/udp.h   |   12 +++++++-----
 net/ipv4/tcp_ipv4.c |   22 ++++++++++++----------
 net/ipv4/udp.c      |   22 ++++++++++++----------
 net/ipv4/udplite.c  |   13 ++++++++++---
 net/ipv6/tcp_ipv6.c |   12 +++++++++---
 net/ipv6/udp.c      |   12 +++++++++---
 net/ipv6/udplite.c  |   13 ++++++++++---
 8 files changed, 75 insertions(+), 41 deletions(-)

diff --git a/include/net/tcp.h b/include/net/tcp.h
index acc620a..b9cdfe3 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -1398,11 +1398,13 @@ enum tcp_seq_states {
 	TCP_SEQ_STATE_TIME_WAIT,
 };
 
+int tcp_seq_open(struct inode *inode, struct file *file);
+
 struct tcp_seq_afinfo {
-	char			*name;
-	sa_family_t		family;
-	struct file_operations	seq_fops;
-	struct seq_operations	seq_ops;
+	char				*name;
+	sa_family_t			family;
+	const struct file_operations	*seq_fops;
+	struct seq_operations		seq_ops;
 };
 
 struct tcp_iter_state {
diff --git a/include/net/udp.h b/include/net/udp.h
index 67ea6fc..3b285f4 100644
--- a/include/net/udp.h
+++ b/include/net/udp.h
@@ -230,12 +230,14 @@ extern struct sock *udp6_lib_lookup(struct net *net, const struct in6_addr *sadd
 #endif
 
 /* /proc */
+int udp_seq_open(struct inode *inode, struct file *file);
+
 struct udp_seq_afinfo {
-	char			*name;
-	sa_family_t		family;
-	struct udp_table	*udp_table;
-	struct file_operations	seq_fops;
-	struct seq_operations	seq_ops;
+	char				*name;
+	sa_family_t			family;
+	struct udp_table		*udp_table;
+	const struct file_operations	*seq_fops;
+	struct seq_operations		seq_ops;
 };
 
 struct udp_iter_state {
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 7963e03..d43dc07 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -2336,7 +2336,7 @@ static void tcp_seq_stop(struct seq_file *seq, void *v)
 	}
 }
 
-static int tcp_seq_open(struct inode *inode, struct file *file)
+int tcp_seq_open(struct inode *inode, struct file *file)
 {
 	struct tcp_seq_afinfo *afinfo = PDE(inode)->data;
 	struct tcp_iter_state *s;
@@ -2352,23 +2352,19 @@ static int tcp_seq_open(struct inode *inode, struct file *file)
 	s->last_pos 		= 0;
 	return 0;
 }
+EXPORT_SYMBOL(tcp_seq_open);
 
 int tcp_proc_register(struct net *net, struct tcp_seq_afinfo *afinfo)
 {
 	int rc = 0;
 	struct proc_dir_entry *p;
 
-	afinfo->seq_fops.open		= tcp_seq_open;
-	afinfo->seq_fops.read		= seq_read;
-	afinfo->seq_fops.llseek		= seq_lseek;
-	afinfo->seq_fops.release	= seq_release_net;
-
 	afinfo->seq_ops.start		= tcp_seq_start;
 	afinfo->seq_ops.next		= tcp_seq_next;
 	afinfo->seq_ops.stop		= tcp_seq_stop;
 
 	p = proc_create_data(afinfo->name, S_IRUGO, net->proc_net,
-			     &afinfo->seq_fops, afinfo);
+			     afinfo->seq_fops, afinfo);
 	if (!p)
 		rc = -ENOMEM;
 	return rc;
@@ -2517,12 +2513,18 @@ out:
 	return 0;
 }
 
+static const struct file_operations tcp_afinfo_seq_fops = {
+	.owner   = THIS_MODULE,
+	.open    = tcp_seq_open,
+	.read    = seq_read,
+	.llseek  = seq_lseek,
+	.release = seq_release_net
+};
+
 static struct tcp_seq_afinfo tcp4_seq_afinfo = {
 	.name		= "tcp",
 	.family		= AF_INET,
-	.seq_fops	= {
-		.owner		= THIS_MODULE,
-	},
+	.seq_fops	= &tcp_afinfo_seq_fops,
 	.seq_ops	= {
 		.show		= tcp4_seq_show,
 	},
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 1b5a193..25b869a 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -2038,7 +2038,7 @@ static void udp_seq_stop(struct seq_file *seq, void *v)
 		spin_unlock_bh(&state->udp_table->hash[state->bucket].lock);
 }
 
-static int udp_seq_open(struct inode *inode, struct file *file)
+int udp_seq_open(struct inode *inode, struct file *file)
 {
 	struct udp_seq_afinfo *afinfo = PDE(inode)->data;
 	struct udp_iter_state *s;
@@ -2054,6 +2054,7 @@ static int udp_seq_open(struct inode *inode, struct file *file)
 	s->udp_table		= afinfo->udp_table;
 	return err;
 }
+EXPORT_SYMBOL(udp_seq_open);
 
 /* ------------------------------------------------------------------------ */
 int udp_proc_register(struct net *net, struct udp_seq_afinfo *afinfo)
@@ -2061,17 +2062,12 @@ int udp_proc_register(struct net *net, struct udp_seq_afinfo *afinfo)
 	struct proc_dir_entry *p;
 	int rc = 0;
 
-	afinfo->seq_fops.open		= udp_seq_open;
-	afinfo->seq_fops.read		= seq_read;
-	afinfo->seq_fops.llseek		= seq_lseek;
-	afinfo->seq_fops.release	= seq_release_net;
-
 	afinfo->seq_ops.start		= udp_seq_start;
 	afinfo->seq_ops.next		= udp_seq_next;
 	afinfo->seq_ops.stop		= udp_seq_stop;
 
 	p = proc_create_data(afinfo->name, S_IRUGO, net->proc_net,
-			     &afinfo->seq_fops, afinfo);
+			     afinfo->seq_fops, afinfo);
 	if (!p)
 		rc = -ENOMEM;
 	return rc;
@@ -2121,14 +2117,20 @@ int udp4_seq_show(struct seq_file *seq, void *v)
 	return 0;
 }
 
+static const struct file_operations udp_afinfo_seq_fops = {
+	.owner    = THIS_MODULE,
+	.open     = udp_seq_open,
+	.read     = seq_read,
+	.llseek   = seq_lseek,
+	.release  = seq_release_net
+};
+
 /* ------------------------------------------------------------------------ */
 static struct udp_seq_afinfo udp4_seq_afinfo = {
 	.name		= "udp",
 	.family		= AF_INET,
 	.udp_table	= &udp_table,
-	.seq_fops	= {
-		.owner	=	THIS_MODULE,
-	},
+	.seq_fops	= &udp_afinfo_seq_fops,
 	.seq_ops	= {
 		.show		= udp4_seq_show,
 	},
diff --git a/net/ipv4/udplite.c b/net/ipv4/udplite.c
index aee9963..08383eb 100644
--- a/net/ipv4/udplite.c
+++ b/net/ipv4/udplite.c
@@ -71,13 +71,20 @@ static struct inet_protosw udplite4_protosw = {
 };
 
 #ifdef CONFIG_PROC_FS
+
+static const struct file_operations udplite_afinfo_seq_fops = {
+	.owner    = THIS_MODULE,
+	.open     = udp_seq_open,
+	.read     = seq_read,
+	.llseek   = seq_lseek,
+	.release  = seq_release_net
+};
+
 static struct udp_seq_afinfo udplite4_seq_afinfo = {
 	.name		= "udplite",
 	.family		= AF_INET,
 	.udp_table 	= &udplite_table,
-	.seq_fops	= {
-		.owner	=	THIS_MODULE,
-	},
+	.seq_fops	= &udplite_afinfo_seq_fops,
 	.seq_ops	= {
 		.show		= udp4_seq_show,
 	},
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 7b8fc57..d0fde8c 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -2158,12 +2158,18 @@ out:
 	return 0;
 }
 
+static const struct file_operations tcp6_afinfo_seq_fops = {
+	.owner   = THIS_MODULE,
+	.open    = tcp_seq_open,
+	.read    = seq_read,
+	.llseek  = seq_lseek,
+	.release = seq_release_net
+};
+
 static struct tcp_seq_afinfo tcp6_seq_afinfo = {
 	.name		= "tcp6",
 	.family		= AF_INET6,
-	.seq_fops	= {
-		.owner		= THIS_MODULE,
-	},
+	.seq_fops	= &tcp6_afinfo_seq_fops,
 	.seq_ops	= {
 		.show		= tcp6_seq_show,
 	},
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index bb95e8e..37f654d 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -1424,13 +1424,19 @@ int udp6_seq_show(struct seq_file *seq, void *v)
 	return 0;
 }
 
+static const struct file_operations udp6_afinfo_seq_fops = {
+	.owner    = THIS_MODULE,
+	.open     = udp_seq_open,
+	.read     = seq_read,
+	.llseek   = seq_lseek,
+	.release  = seq_release_net
+};
+
 static struct udp_seq_afinfo udp6_seq_afinfo = {
 	.name		= "udp6",
 	.family		= AF_INET6,
 	.udp_table	= &udp_table,
-	.seq_fops	= {
-		.owner	=	THIS_MODULE,
-	},
+	.seq_fops	= &udp6_afinfo_seq_fops,
 	.seq_ops	= {
 		.show		= udp6_seq_show,
 	},
diff --git a/net/ipv6/udplite.c b/net/ipv6/udplite.c
index 986c4de..8889aa2 100644
--- a/net/ipv6/udplite.c
+++ b/net/ipv6/udplite.c
@@ -93,13 +93,20 @@ void udplitev6_exit(void)
 }
 
 #ifdef CONFIG_PROC_FS
+
+static const struct file_operations udplite6_afinfo_seq_fops = {
+	.owner    = THIS_MODULE,
+	.open     = udp_seq_open,
+	.read     = seq_read,
+	.llseek   = seq_lseek,
+	.release  = seq_release_net
+};
+
 static struct udp_seq_afinfo udplite6_seq_afinfo = {
 	.name		= "udplite6",
 	.family		= AF_INET6,
 	.udp_table	= &udplite_table,
-	.seq_fops	= {
-		.owner	=	THIS_MODULE,
-	},
+	.seq_fops	= &udplite6_afinfo_seq_fops,
 	.seq_ops	= {
 		.show		= udp6_seq_show,
 	},
-- 
1.7.6


-- 
Arjan van de Ven 	Intel Open Source Technology Centre
For development, discussion and tips for power savings, 
visit http://www.lesswatts.org
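
The shape of the change can be sketched in plain C, outside the kernel. Everything below is hypothetical (the demo_ names, the string standing in for THIS_MODULE, the toy callbacks); it only illustrates replacing an embedded, runtime-filled operations struct with a pointer to a const table that each module defines for itself.

```c
#include <stddef.h>

/* Stand-in for struct file_operations. */
struct demo_file_ops {
	const char *owner;		/* stands in for THIS_MODULE */
	int (*open)(int id);
	int (*release)(int id);
};

/* Stand-in for tcp_seq_afinfo / udp_seq_afinfo after the patch: it holds
 * a pointer to a read-only table instead of a writable embedded struct. */
struct demo_afinfo {
	const char *name;
	const struct demo_file_ops *seq_fops;
};

static int demo_open(int id)    { return id + 1; }
static int demo_release(int id) { return id - 1; }

/* Fully initialized at compile time, so it can be const. Each module
 * would define its own table so that .owner names the right module. */
static const struct demo_file_ops demo_tcp4_fops = {
	.owner   = "ipv4",
	.open    = demo_open,
	.release = demo_release,
};

static struct demo_afinfo demo_tcp4_afinfo = {
	.name     = "tcp",
	.seq_fops = &demo_tcp4_fops,
};

/* Registration only reads through the pointer; nothing is patched in. */
static inline int demo_register(const struct demo_afinfo *afinfo, int id)
{
	if (!afinfo->seq_fops || !afinfo->seq_fops->open)
		return -1;
	return afinfo->seq_fops->open(id);
}
```

Because the table is const, a toolchain can place it in read-only data, and a stray write to one of the function pointers becomes a compile error rather than a silent change.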


* Re: >Re: [RFC] should VM_BUG_ON(cond) really evaluate cond
From: Linus Torvalds @ 2011-10-30 17:07 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Andi Kleen, Ben Hutchings, linux-kernel, netdev, Andrew Morton
In-Reply-To: <1319987765.13597.60.camel@edumazet-laptop>

On Sun, Oct 30, 2011 at 8:16 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>
> Because it doesn't work if x is const.

Just remove the const. Problem solved.

Both cases of 'const' are totally arbitrary and useless. The
test_bit() one is literally a cast to const (admittedly also *from*
const, but nobody cares), and the atomic_read() one is just because it
uses a silly inline function where a macro would be simpler.

                              Linus


* Re: >Re: [RFC] should VM_BUG_ON(cond) really evaluate cond
From: Eric Dumazet @ 2011-10-30 17:41 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andi Kleen, Ben Hutchings, linux-kernel, netdev, Andrew Morton
In-Reply-To: <CA+55aFyGEhBMv31QXN7q9PJc36TVtHLOvdFYB1+6NTo+nKSkbQ@mail.gmail.com>

On Sunday, 30 October 2011 at 10:07 -0700, Linus Torvalds wrote:
> On Sun, Oct 30, 2011 at 8:16 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> >
> > Because it doesn't work if x is const.
> 
> Just remove the const. Problem solved.
> 
> Both cases of 'const' are totally arbitrary and useless. The
> test_bit() one is literally a cast to const (admittedly also *from*
> const, but nobody cares), and the atomic_read() one is just because it
> uses a silly inline function where a macro would be simpler.
> 

Oh well, I am lost. I always considered inline functions better because
of prototype checks.


Changing atomic_read(const atomic_t *v) prototype to
atomic_read(atomic_t *v) is not an option.


To save your time and mine, please pick your favorite among:

1) The patch I did

2) 
 static inline int atomic_read(const atomic_t *v)
 {
	return ACCESS_AT_MOST_ONCE(((atomic_t *)v)->counter);
 }

3) 
 static inline int atomic_read(const atomic_t *v)
 {
	return ACCESS_AT_MOST_ONCE(*(int *)&(v)->counter);
 }

4) macro (I personally don't like it)
#define atomic_read(v) ACCESS_AT_MOST_ONCE(*(int *)&(v)->counter)

Thanks


* Re: >Re: [RFC] should VM_BUG_ON(cond) really evaluate cond
From: Linus Torvalds @ 2011-10-30 17:48 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Andi Kleen, Ben Hutchings, linux-kernel, netdev, Andrew Morton
In-Reply-To: <1319996494.13597.69.camel@edumazet-laptop>

On Sun, Oct 30, 2011 at 10:41 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>
> Changing atomic_read(const atomic_t *v) prototype to
> atomic_read(atomic_t *v) is not an option.

Why not?

    #define atomic_read(v)     ACCESS_AT_MOST_ONCE((v)->counter)

seems to be the cleanest thing.

And if you don't think this is "an option", I really can't see why you
care about the extra instructions in the code stream either.

> 4) macro (I personally don't like it)
> #define atomic_read(v) ACCESS_AT_MOST_ONCE(*(int *)&(v)->counter)

Why the *hell* would you have that cast there?

If somebody passes "const atomic_t"'s around, then just shoot the
bastard. The concept makes no sense.

Grepping for "const atomic_t" shows absolutely *zero* users, except
for the crazy inline function declaration itself.

Stop the insanity already. Get rid of the f*cking "const".

                      Linus


* Re: >Re: [RFC] should VM_BUG_ON(cond) really evaluate cond
From: Eric Dumazet @ 2011-10-30 17:59 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andi Kleen, Ben Hutchings, linux-kernel, netdev, Andrew Morton
In-Reply-To: <CA+55aFz-mp8k7qdF3LaOjgX19TbzrCirGuCmGwqVMq9SwQ8yvw@mail.gmail.com>

On Sunday, 30 October 2011 at 10:48 -0700, Linus Torvalds wrote:
> On Sun, Oct 30, 2011 at 10:41 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> >
> > Changing atomic_read(const atomic_t *v) prototype to
> > atomic_read(atomic_t *v) is not an option.
> 
> Why not?
> 
>     #define atomic_read(v)     ACCESS_AT_MOST_ONCE((v)->counter)
> 
> seems to be the cleanest thing.
> 

As I said, because v can be a const pointer provided by the caller.

Try it yourself and you'll discover hundreds of call sites doing

.... some_function(const struct *xxx, ...)
{
	if (atomic_read(&xxx->refcnt) <= 0)
		do_something();
	else
		do_otherthing();
}

> And if you don't think this is "an option", I really can't see why you
> care about the extra instructions in the code stream either.
> 

Not an option if we have to change all callers that expected to be able
to use a const atomic_t pointer.

OK, I now have to leave the net.


* Re: >Re: [RFC] should VM_BUG_ON(cond) really evaluate cond
From: Linus Torvalds @ 2011-10-30 18:09 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Andi Kleen, Ben Hutchings, linux-kernel, netdev, Andrew Morton
In-Reply-To: <1319997593.13597.76.camel@edumazet-laptop>

On Sun, Oct 30, 2011 at 10:59 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>
> As I said, because v can be a const pointer provided by the caller.
>
> Try it yourself and you'll discover hundreds of call sites doing
>
> .... some_function(const struct *xxx, ...)
> {
>        if (atomic_read(&xxx->refcnt) <= 0)
>                do_something();

Argh. Ok. Testing a refcount in a const struct doesn't make much
sense, but there do seem to be perfectly valid uses of it
(sk_wmem_alloc etc).

Annoying. I guess we have to have those casts. Grr.

                        Linus
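
The cast being conceded here can be sketched in userspace as follows, assuming GCC or Clang. This follows the shape of option 2 from earlier in the thread, combined with the typeof-based temporary Andi asked about; all demo_ names are hypothetical and this is not the code that was eventually merged.

```c
/* Stand-in for atomic_t. */
typedef struct {
	int counter;
} demo_atomic_t;

/* typeof-based variant: __y takes the type of x, so x must not be
 * const-qualified, otherwise __y would be a read-only asm output. */
#define DEMO_ACCESS_AT_MOST_ONCE(x)				\
	({	__typeof__(x) __y;				\
		__asm__("" : "=r" (__y) : "0" (x));		\
		__y;						\
	})

/* The prototype keeps accepting const pointers; the cast strips const
 * before the macro sees the expression, so __typeof__ yields plain int. */
static inline int demo_atomic_read(const demo_atomic_t *v)
{
	return DEMO_ACCESS_AT_MOST_ONCE(((demo_atomic_t *)v)->counter);
}

/* A caller of the kind discussed above: it only holds a const pointer. */
struct demo_sock {
	demo_atomic_t refcnt;
};

static inline int demo_sock_is_dead(const struct demo_sock *sk)
{
	return demo_atomic_read(&sk->refcnt) <= 0;
}
```

Dropping the cast in demo_atomic_read() should reproduce the "read-only variable used as asm output" error quoted earlier in the thread, since __y would then be declared const int.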


* Re: hibernate hangs TCP Re: [EXAMPLE CODE] Parasite thread injection and TCP connection hijacking
From: Tejun Heo @ 2011-10-30 20:16 UTC (permalink / raw)
  To: David Fries; +Cc: netdev, linux-pm, linux-kernel
In-Reply-To: <20111030044821.GA23741@spacedout.fries.net>

(cc'ing Rafael and linux-pm)

On Sat, Oct 29, 2011 at 11:48:21PM -0500, David Fries wrote:
> I saw the write up on this on lwn.net, pretty creative by the way, and
> it got me thinking about a different checkpoint/restart problem I've
> been running into.  Specifically in hibernating to disk.  In the
> hibernate case active TCP connections hang after resuming, while an
> idle TCP connection will continue after the system is back up.  My
> observation is the kernel checkpoints itself to memory, enables
> devices, writes out that checkpoint image to storage, then powers off.
> The problem is if TCP packets are received while writing to storage,
> the kernel will continue to queue and ack those TCP packets, but the
> running kernel and its network state is shortly lost.  When the
> computer resumes, those TCP byte sequences hang the TCP connection for
> an extended period of time while the resumed computer refuses to
> acknowledge the data that was received after checkpointing and the now
> running kernel knew nothing about, and the other computer tries in
> vain to resend any data that hadn't yet been acknowledged, which is
> always after the data that was lost, until one of them eventually
> gives up.
> 
> I've been wondering if it was safe or possible to leave any network
> interfaces down after the checkpoint, or what the right solution would
> be.  I didn't think marking every TCP connection with a ZOMBIE_KERNEL
> bit just after the kernel checkpoint (for the kernel is walking dead
> and won't remember anything that happens), and then prevent any TCP
> acks from being sent for those connections would be the right
> solution.  I've taken to unplugging the physical lan cable,
> hibernating to disk, and plugging it back in after the system is down,
> to avoid the problem.  Any ideas?

Hmmm... sounds like taking down network interfaces before starting
hibernation sequence should be enough, which shouldn't be too
difficult to implement from userland.  Rafael, what do you think?

Thanks.

-- 
tejun


* Re: hibernate hangs TCP Re: [EXAMPLE CODE] Parasite thread injection and TCP connection hijacking
From: David Fries @ 2011-10-30 20:43 UTC (permalink / raw)
  To: Tejun Heo; +Cc: linux-kernel, netdev, Rafael J. Wysocki, linux-pm
In-Reply-To: <20111030201618.GA7696@google.com>

On Sun, Oct 30, 2011 at 01:16:18PM -0700, Tejun Heo wrote:
> (cc'ing Rafael and linux-pm)
> 
> On Sat, Oct 29, 2011 at 11:48:21PM -0500, David Fries wrote:
> > I saw the write up on this on lwn.net, pretty creative by the way, and
> > it got me thinking about a different checkpoint/restart problem I've
> > been running into.  Specifically in hibernating to disk.  In the
> > hibernate case active TCP connections hang after resuming, while an
> > idle TCP connection will continue after the system is back up.  My
> > observation is the kernel checkpoints itself to memory, enables
> > devices, writes out that checkpoint image to storage, then powers off.
> > The problem is if TCP packets are received while writing to storage,
> > the kernel will continue to queue and ack those TCP packets, but the
> > running kernel and its network state is shortly lost.  When the
> > computer resumes, those TCP byte sequences hang the TCP connection for
> > an extended period of time while the resumed computer refuses to
> > acknowledge the data that was received after checkpointing and the now
> > running kernel knew nothing about, and the other computer tries in
> > vain to resend any data that hadn't yet been acknowledged, which is
> > always after the data that was lost, until one of them eventually
> > gives up.
> > 
> > I've been wondering if it was safe or possible to leave any network
> > interfaces down after the checkpoint, or what the right solution would
> > be.  I didn't think marking every TCP connection with a ZOMBIE_KERNEL
> > bit just after the kernel checkpoint (for the kernel is walking dead
> > and won't remember anything that happens), and then prevent any TCP
> > acks from being sent for those connections would be the right
> > solution.  I've taken to unplugging the physical lan cable,
> > hibernating to disk, and plugging it back in after the system is down,
> > to avoid the problem.  Any ideas?
> 
> Hmmm... sounds like taking down network interfaces before starting
> hibernation sequence should be enough, which shouldn't be too
> difficult to implement from userland.  Rafael, what do you think?

What I observe is that the kernel prints "Preallocating image memory",
then when the screen goes blank the network link light also goes out,
then the screen comes back on with "Compressing and saving" and the
link light comes back on, until the image has been saved and the system
shuts down.  So the kernel is already bringing the network down; it just
needs to keep it there until the original checkpointed kernel is back
up.

Having userspace bring the network interfaces down is problematic.  As an
example, one of my systems is running hostapd as an access point and
bridging that to the wired ethernet; that's not a trivial task to
set up and take down (the Debian ifup can set it up, but I've not
figured out yet how to get ifdown to take everything down cleanly, and
I sometimes manually run hostapd if I'm troubleshooting).  Any
manually added routes would go away; good luck setting everything
back up the way it was before for all the different configurations out
there in userspace.  On top of those issues, programs would now see a
period with networking down that they wouldn't otherwise have seen.

-- 
David Fries <david@fries.net>    PGP pub CB1EE8F0
http://fries.net/~david/


* Re: Broken link in /sys/class/net/ [was: [GIT] Networking]
From: Jiri Slaby @ 2011-10-30 20:49 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Jiri Slaby, Greg KH, Linus Torvalds, David Miller,
	Mikulas Patocka, akpm, netdev, linux-kernel
In-Reply-To: <m11utvytg4.fsf@fess.ebiederm.org>

On 10/30/2011 05:49 AM, Eric W. Biederman wrote:
> Jiri Slaby <jslaby@suse.cz> writes:
> 
>> On 10/25/2011 03:13 PM, Greg KH wrote:
>>> On Tue, Oct 25, 2011 at 01:46:11PM +0200, Linus Torvalds wrote:
>>>> Anyway, after that rant about really bad practices, let me say that I
>>>> did fix up the conflict and I think it's right. But I won't guarantee
>>>> it, so please check the changes to fs/sysfs/dir.c.
>>>
>>> I think it looks ok, I've booted the merge result, and am typing and
>>> sending this from the new kernel, and it hasn't crashed yet :)
>>
>> Hi, maybe this was not caused by the merge, but the patch[1] causes this
>> mess in /sys/class/net/ for me:
>> l????????? ? ?    ?    ?             ? eth1
>>
>> This happens after one renames a net device -- the new name is eth1 here.
>>
>> [1] 4f72c0cab40 (sysfs: use rb-tree for name lookups)
> 
> This looks pretty fixable, but today sysfs_rename does not do anything
> to move a renamed entry to a different position in the rbtree.
> 
> If the directory itself changes sysfs_rename should be fine, and it
> looks like a trivial patch to always apply the directory rename logic
> in sysfs_rename. 
> 
> I think all we need is something like the untested patch below to fix
> the network device rename problem.

Looks like it works. Thanks.

> diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
> index 48ffbdf..a294068 100644
> --- a/fs/sysfs/dir.c
> +++ b/fs/sysfs/dir.c
> @@ -865,14 +865,13 @@ int sysfs_rename(struct sysfs_dirent *sd,
>  		sd->s_name = new_name;
>  	}
>  
> -	/* Remove from old parent's list and insert into new parent's list. */
> -	if (sd->s_parent != new_parent_sd) {
> -		sysfs_unlink_sibling(sd);
> -		sysfs_get(new_parent_sd);
> -		sysfs_put(sd->s_parent);
> -		sd->s_parent = new_parent_sd;
> -		sysfs_link_sibling(sd);
> -	}
> +	/* Move to the appropriate place in the appropriate directories rbtree. */
> +	sysfs_unlink_sibling(sd);
> +	sysfs_get(new_parent_sd);
> +	sysfs_put(sd->s_parent);
> +	sd->s_parent = new_parent_sd;
> +	sysfs_link_sibling(sd);
> +
>  	sd->s_ns = new_ns;
>  
>  	error = 0;
> 


-- 
js
suse labs
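
The failure mode discussed above (an entry renamed in place inside a name-ordered structure, so that later lookups miss it) can be reproduced with a much simpler structure than an rbtree. The sketch below uses a sorted array with binary search; all demo_ names are hypothetical, qsort() stands in for the unlink and relink pair, and the new name is deliberately chosen so that its sort position changes.

```c
#include <stdlib.h>
#include <string.h>

#define DEMO_NAME_LEN 16

struct demo_dirent {
	char name[DEMO_NAME_LEN];
};

static int demo_cmp(const void *a, const void *b)
{
	return strcmp(((const struct demo_dirent *)a)->name,
		      ((const struct demo_dirent *)b)->name);
}

/* Binary search over entries sorted by name; returns the index or -1.
 * Like an rbtree lookup, it relies on the ordering being intact. */
static inline int demo_lookup(const struct demo_dirent *dir, int n,
			      const char *name)
{
	int lo = 0, hi = n - 1;

	while (lo <= hi) {
		int mid = lo + (hi - lo) / 2;
		int c = strcmp(name, dir[mid].name);

		if (c == 0)
			return mid;
		if (c < 0)
			hi = mid - 1;
		else
			lo = mid + 1;
	}
	return -1;
}

/* Naive rename: the key changes but the entry stays where the old name
 * sorted, which is the behaviour being fixed above. */
static inline void demo_rename_in_place(struct demo_dirent *d,
					const char *new_name)
{
	strncpy(d->name, new_name, DEMO_NAME_LEN - 1);
	d->name[DEMO_NAME_LEN - 1] = '\0';
}

/* Rename, then restore the ordering (stand-in for unlink + relink). */
static inline void demo_rename_relink(struct demo_dirent *dir, int n,
				      int idx, const char *new_name)
{
	demo_rename_in_place(&dir[idx], new_name);
	qsort(dir, (size_t)n, sizeof(dir[0]), demo_cmp);
}
```

After demo_rename_in_place() the entry is still physically present but can no longer be found by name; demo_rename_relink() makes it reachable again.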


* Re: [PATCH] batman-adv: Fix range check for expected packets
From: Marek Lindner @ 2011-10-30 20:58 UTC (permalink / raw)
  To: b.a.t.m.a.n-ZwoEplunGu2X36UT3dwllkB+6BGkLq7r
  Cc: netdev-u79uwXL29TY76Z2rM5mHXA, Simon Wunderlich
In-Reply-To: <1319988163-3249-1-git-send-email-siwu-MaAgPAbsBIVS8oHt8HbXEIQuADTiUCJX@public.gmane.org>

On Sunday, October 30, 2011 16:22:43 Simon Wunderlich wrote:
> The check for new packets in the future used a wrong binary operator,
> which makes the check expression always true and accepting too many
> packets.

Applied in revision 00ca20e.

Thanks,
Marek


* [PATCH v2] net: add calxeda xgmac ethernet driver
From: Rob Herring @ 2011-10-30 21:30 UTC (permalink / raw)
  To: netdev, devicetree-discuss; +Cc: joe, saeed.bishara, Rob Herring

From: Rob Herring <rob.herring@calxeda.com>

Add support for the XGMAC 10Gb ethernet device in the Calxeda Highbank
SOC.

Signed-off-by: Rob Herring <rob.herring@calxeda.com>
---
v2:
- use __le32 for descriptor fields and cpu_to_le32/le32_to_cpu to access
- use u32 instead of dma_addr_t for descriptor phys addresses
- improve allocation error handling for descriptor ring allocations
- converted all prints to netdev_XXX
- rebase to current Linus master
- move into drivers/net/ethernet
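
To illustrate the first two v2 items, here is a userspace sketch of descriptor fields kept in little-endian byte order and accessed through explicit conversion helpers. It is not taken from the driver: the demo_ names and the two-field layout are hypothetical, and the helpers are portable stand-ins for the kernel's cpu_to_le32()/le32_to_cpu().

```c
#include <stdint.h>
#include <string.h>

/* Descriptor as the device would read it from memory. In the kernel the
 * fields would be typed __le32 so sparse can catch missing conversions. */
struct demo_desc {
	uint32_t flags;		/* stored little-endian */
	uint32_t buf_addr;	/* 32-bit bus address, stored little-endian */
};

/* Host order to little-endian storage, independent of host endianness. */
static inline uint32_t demo_cpu_to_le32(uint32_t v)
{
	uint8_t b[4] = {
		(uint8_t)v, (uint8_t)(v >> 8),
		(uint8_t)(v >> 16), (uint8_t)(v >> 24),
	};
	uint32_t out;

	memcpy(&out, b, sizeof(out));
	return out;
}

/* Little-endian storage back to host order. */
static inline uint32_t demo_le32_to_cpu(uint32_t v)
{
	uint8_t b[4];

	memcpy(b, &v, sizeof(b));
	return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
	       ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}

/* Every store into the descriptor goes through the conversion. */
static inline void demo_desc_set_buf(struct demo_desc *d, uint32_t addr,
				     uint32_t flags)
{
	d->buf_addr = demo_cpu_to_le32(addr);
	d->flags = demo_cpu_to_le32(flags);
}
```

On a little-endian host both helpers are effectively no-ops, which is why missing conversions tend to go unnoticed until the code runs on a big-endian machine.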

 .../devicetree/bindings/net/calxeda-xgmac.txt      |   16 +
 drivers/net/ethernet/Kconfig                       |    1 +
 drivers/net/ethernet/Makefile                      |    1 +
 drivers/net/ethernet/calxeda/Kconfig               |    7 +
 drivers/net/ethernet/calxeda/Makefile              |    2 +
 drivers/net/ethernet/calxeda/xgmac.c               | 1928 ++++++++++++++++++++
 6 files changed, 1955 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/net/calxeda-xgmac.txt
 create mode 100644 drivers/net/ethernet/calxeda/Kconfig
 create mode 100644 drivers/net/ethernet/calxeda/Makefile
 create mode 100644 drivers/net/ethernet/calxeda/xgmac.c

diff --git a/Documentation/devicetree/bindings/net/calxeda-xgmac.txt b/Documentation/devicetree/bindings/net/calxeda-xgmac.txt
new file mode 100644
index 0000000..c03a7bc
--- /dev/null
+++ b/Documentation/devicetree/bindings/net/calxeda-xgmac.txt
@@ -0,0 +1,16 @@
+* Calxeda Highbank 10Gb XGMAC Ethernet
+
+Required properties:
+- compatible : Should be "calxeda,hb-xgmac"
+- reg : Address and length of the register set for the device
+- interrupts : Should contain 3 xgmac interrupts. The 1st is main interrupt.
+  The 2nd is pwr mgt interrupt. The 3rd is low power state interrupt.
+
+Example:
+
+ethernet@fff50000 {
+        compatible = "calxeda,hb-xgmac";
+        reg = <0xfff50000 0x1000>;
+        interrupts = <0 77 4  0 78 4  0 79 4>;
+};
+
diff --git a/drivers/net/ethernet/Kconfig b/drivers/net/ethernet/Kconfig
index 6dff5a0..57e3fda 100644
--- a/drivers/net/ethernet/Kconfig
+++ b/drivers/net/ethernet/Kconfig
@@ -28,6 +28,7 @@ source "drivers/net/ethernet/cadence/Kconfig"
 source "drivers/net/ethernet/adi/Kconfig"
 source "drivers/net/ethernet/broadcom/Kconfig"
 source "drivers/net/ethernet/brocade/Kconfig"
+source "drivers/net/ethernet/calxeda/Kconfig"
 source "drivers/net/ethernet/chelsio/Kconfig"
 source "drivers/net/ethernet/cirrus/Kconfig"
 source "drivers/net/ethernet/cisco/Kconfig"
diff --git a/drivers/net/ethernet/Makefile b/drivers/net/ethernet/Makefile
index c53ad3a..683aeb6 100644
--- a/drivers/net/ethernet/Makefile
+++ b/drivers/net/ethernet/Makefile
@@ -14,6 +14,7 @@ obj-$(CONFIG_NET_ATMEL) += cadence/
 obj-$(CONFIG_NET_BFIN) += adi/
 obj-$(CONFIG_NET_VENDOR_BROADCOM) += broadcom/
 obj-$(CONFIG_NET_VENDOR_BROCADE) += brocade/
+obj-$(CONFIG_NET_CALXEDA_XGMAC) += calxeda/
 obj-$(CONFIG_NET_VENDOR_CHELSIO) += chelsio/
 obj-$(CONFIG_NET_VENDOR_CIRRUS) += cirrus/
 obj-$(CONFIG_NET_VENDOR_CISCO) += cisco/
diff --git a/drivers/net/ethernet/calxeda/Kconfig b/drivers/net/ethernet/calxeda/Kconfig
new file mode 100644
index 0000000..a52e725
--- /dev/null
+++ b/drivers/net/ethernet/calxeda/Kconfig
@@ -0,0 +1,7 @@
+config NET_CALXEDA_XGMAC
+	tristate "Calxeda 1G/10G XGMAC Ethernet driver"
+
+	select CRC32
+	help
+	  This is the driver for the XGMAC Ethernet IP block found on Calxeda
+	  Highbank platforms.
diff --git a/drivers/net/ethernet/calxeda/Makefile b/drivers/net/ethernet/calxeda/Makefile
new file mode 100644
index 0000000..5057cd2
--- /dev/null
+++ b/drivers/net/ethernet/calxeda/Makefile
@@ -0,0 +1,2 @@
+obj-$(CONFIG_NET_CALXEDA_XGMAC) += xgmac.o
+
diff --git a/drivers/net/ethernet/calxeda/xgmac.c b/drivers/net/ethernet/calxeda/xgmac.c
new file mode 100644
index 0000000..5e93c8b
--- /dev/null
+++ b/drivers/net/ethernet/calxeda/xgmac.c
@@ -0,0 +1,1928 @@
+/*
+ * Copyright 2010-2011 Calxeda, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#include <linux/module.h>
+#include <linux/init.h>
+#include <linux/kernel.h>
+#include <linux/circ_buf.h>
+#include <linux/interrupt.h>
+#include <linux/etherdevice.h>
+#include <linux/platform_device.h>
+#include <linux/skbuff.h>
+#include <linux/ethtool.h>
+#include <linux/if.h>
+#include <linux/crc32.h>
+#include <linux/dma-mapping.h>
+#include <linux/slab.h>
+
+/* XGMAC Register definitions */
+#define XGMAC_CONTROL		0x00000000	/* MAC Configuration */
+#define XGMAC_FRAME_FILTER	0x00000004	/* MAC Frame Filter */
+#define XGMAC_FLOW_CTRL		0x00000018	/* MAC Flow Control */
+#define XGMAC_VLAN_TAG		0x0000001C	/* VLAN Tags */
+#define XGMAC_VERSION		0x00000020	/* Version */
+#define XGMAC_VLAN_INCL		0x00000024	/* VLAN tag for tx frames */
+#define XGMAC_LPI_CTRL		0x00000028	/* LPI Control and Status */
+#define XGMAC_LPI_TIMER		0x0000002C	/* LPI Timers Control */
+#define XGMAC_TX_PACE		0x00000030	/* Transmit Pace and Stretch */
+#define XGMAC_VLAN_HASH		0x00000034	/* VLAN Hash Table */
+#define XGMAC_DEBUG		0x00000038	/* Debug */
+#define XGMAC_INT_STAT		0x0000003C	/* Interrupt and Control */
+#define XGMAC_ADDR_HIGH(reg)	(0x00000040+((reg) * 8))
+#define XGMAC_ADDR_LOW(reg)	(0x00000044+((reg) * 8))
+#define XGMAC_HASH(n)		(0x00000300 + (n) * 4) /* HASH table regs */
+#define XGMAC_NUM_HASH		16
+#define XGMAC_OMR		0x00000400
+#define XGMAC_REMOTE_WAKE	0x00000700	/* Remote Wake-Up Frm Filter */
+#define XGMAC_PMT		0x00000704	/* PMT Control and Status */
+#define XGMAC_MMC_CTRL		0x00000800	/* XGMAC MMC Control */
+#define XGMAC_MMC_INTR_RX	0x00000804	/* Receive Interrupt */
+#define XGMAC_MMC_INTR_TX	0x00000808	/* Transmit Interrupt */
+#define XGMAC_MMC_INTR_MASK_RX	0x0000080c	/* Receive Interrupt Mask */
+#define XGMAC_MMC_INTR_MASK_TX	0x00000810	/* Transmit Interrupt Mask */
+
+/* Hardware TX Statistics Counters */
+#define XGMAC_MMC_TXOCTET_GB_LO	0x00000814
+#define XGMAC_MMC_TXOCTET_GB_HI	0x00000818
+#define XGMAC_MMC_TXFRAME_GB_LO	0x0000081C
+#define XGMAC_MMC_TXFRAME_GB_HI	0x00000820
+#define XGMAC_MMC_TXBCFRAME_G	0x00000824
+#define XGMAC_MMC_TXMCFRAME_G	0x0000082C
+#define XGMAC_MMC_TXUCFRAME_GB	0x00000864
+#define XGMAC_MMC_TXMCFRAME_GB	0x0000086C
+#define XGMAC_MMC_TXBCFRAME_GB	0x00000874
+#define XGMAC_MMC_TXUNDERFLOW	0x0000087C
+#define XGMAC_MMC_TXOCTET_G_LO	0x00000884
+#define XGMAC_MMC_TXOCTET_G_HI	0x00000888
+#define XGMAC_MMC_TXFRAME_G_LO	0x0000088C
+#define XGMAC_MMC_TXFRAME_G_HI	0x00000890
+#define XGMAC_MMC_TXPAUSEFRAME	0x00000894
+#define XGMAC_MMC_TXVLANFRAME	0x0000089C
+
+/* Hardware RX Statistics Counters */
+#define XGMAC_MMC_RXFRAME_GB_LO	0x00000900
+#define XGMAC_MMC_RXFRAME_GB_HI	0x00000904
+#define XGMAC_MMC_RXOCTET_GB_LO	0x00000908
+#define XGMAC_MMC_RXOCTET_GB_HI	0x0000090C
+#define XGMAC_MMC_RXOCTET_G_LO	0x00000910
+#define XGMAC_MMC_RXOCTET_G_HI	0x00000914
+#define XGMAC_MMC_RXBCFRAME_G	0x00000918
+#define XGMAC_MMC_RXMCFRAME_G	0x00000920
+#define XGMAC_MMC_RXCRCERR	0x00000928
+#define XGMAC_MMC_RXRUNT	0x00000930
+#define XGMAC_MMC_RXJABBER	0x00000934
+#define XGMAC_MMC_RXUCFRAME_G	0x00000970
+#define XGMAC_MMC_RXLENGTHERR	0x00000978
+#define XGMAC_MMC_RXOVERFLOW	0x00000990
+#define XGMAC_MMC_RXPAUSEFRAME	0x00000988
+#define XGMAC_MMC_RXVLANFRAME	0x00000998
+
+/* DMA Control and Status Registers */
+#define XGMAC_DMA_BUS_MODE	0x00000f00	/* Bus Mode */
+#define XGMAC_DMA_TX_POLL	0x00000f04	/* Transmit Poll Demand */
+#define XGMAC_DMA_RX_POLL	0x00000f08	/* Receive Poll Demand */
+#define XGMAC_DMA_RX_BASE_ADDR	0x00000f0c	/* Receive List Base */
+#define XGMAC_DMA_TX_BASE_ADDR	0x00000f10	/* Transmit List Base */
+#define XGMAC_DMA_STATUS	0x00000f14	/* Status Register */
+#define XGMAC_DMA_CONTROL	0x00000f18	/* Ctrl (Operational Mode) */
+#define XGMAC_DMA_INTR_ENA	0x00000f1c	/* Interrupt Enable */
+#define XGMAC_DMA_MISS_FRAME_CTR 0x00000f20	/* Missed Frame Counter */
+#define XGMAC_DMA_RI_WDOG_TIMER	0x00000f24	/* RX Intr Watchdog Timer */
+#define XGMAC_DMA_AXI_BUS	0x00000f28	/* AXI Bus Mode */
+#define XGMAC_DMA_AXI_STATUS	0x00000f2C	/* AXI Status */
+#define XGMAC_DMA_HW_FEATURE	0x00000f58	/* Enabled Hardware Features */
+
+#define XGMAC_ADDR_AE		0x80000000
+#define XGMAC_MAX_FILTER_ADDR	31
+
+/* PMT Control and Status */
+#define XGMAC_PMT_POINTER_RESET	0x80000000
+#define XGMAC_PMT_GLBL_UNICAST	0x00000200
+#define XGMAC_PMT_WAKEUP_RX_FRM	0x00000040
+#define XGMAC_PMT_MAGIC_PKT	0x00000020
+#define XGMAC_PMT_WAKEUP_FRM_EN	0x00000004
+#define XGMAC_PMT_MAGIC_PKT_EN	0x00000002
+#define XGMAC_PMT_POWERDOWN	0x00000001
+
+#define XGMAC_CONTROL_SPD	0x40000000	/* Speed control */
+#define XGMAC_CONTROL_SPD_MASK	0x60000000
+#define XGMAC_CONTROL_SPD_1G	0x60000000
+#define XGMAC_CONTROL_SPD_2_5G	0x40000000
+#define XGMAC_CONTROL_SPD_10G	0x00000000
+#define XGMAC_CONTROL_SARC	0x10000000	/* Source Addr Insert/Replace */
+#define XGMAC_CONTROL_SARK_MASK	0x18000000
+#define XGMAC_CONTROL_CAR	0x04000000	/* CRC Addition/Replacement */
+#define XGMAC_CONTROL_CAR_MASK	0x06000000
+#define XGMAC_CONTROL_DP	0x01000000	/* Disable Padding */
+#define XGMAC_CONTROL_WD	0x00800000	/* Disable Watchdog on rx */
+#define XGMAC_CONTROL_JD	0x00400000	/* Jabber disable */
+#define XGMAC_CONTROL_JE	0x00100000	/* Jumbo frame */
+#define XGMAC_CONTROL_LM	0x00001000	/* Loop-back mode */
+#define XGMAC_CONTROL_IPC	0x00000400	/* Checksum Offload */
+#define XGMAC_CONTROL_ACS	0x00000080	/* Automatic Pad/FCS Strip */
+#define XGMAC_CONTROL_DDIC	0x00000010	/* Disable Deficit Idle Count */
+#define XGMAC_CONTROL_TE	0x00000008	/* Transmitter Enable */
+#define XGMAC_CONTROL_RE	0x00000004	/* Receiver Enable */
+
+/* XGMAC Frame Filter defines */
+#define XGMAC_FRAME_FILTER_PR	0x00000001	/* Promiscuous Mode */
+#define XGMAC_FRAME_FILTER_HUC	0x00000002	/* Hash Unicast */
+#define XGMAC_FRAME_FILTER_HMC	0x00000004	/* Hash Multicast */
+#define XGMAC_FRAME_FILTER_DAIF	0x00000008	/* DA Inverse Filtering */
+#define XGMAC_FRAME_FILTER_PM	0x00000010	/* Pass all multicast */
+#define XGMAC_FRAME_FILTER_DBF	0x00000020	/* Disable Broadcast frames */
+#define XGMAC_FRAME_FILTER_SAIF	0x00000100	/* Inverse Filtering */
+#define XGMAC_FRAME_FILTER_SAF	0x00000200	/* Source Address Filter */
+#define XGMAC_FRAME_FILTER_HPF	0x00000400	/* Hash or perfect Filter */
+#define XGMAC_FRAME_FILTER_VHF	0x00000800	/* VLAN Hash Filter */
+#define XGMAC_FRAME_FILTER_VPF	0x00001000	/* VLAN Perfect Filter */
+#define XGMAC_FRAME_FILTER_RA	0x80000000	/* Receive all mode */
+
+/* XGMAC FLOW CTRL defines */
+#define XGMAC_FLOW_CTRL_PT_MASK	0xffff0000	/* Pause Time Mask */
+#define XGMAC_FLOW_CTRL_PT_SHIFT	16
+#define XGMAC_FLOW_CTRL_DZQP	0x00000080	/* Disable Zero-Quanta Phase */
+#define XGMAC_FLOW_CTRL_PLT	0x00000020	/* Pause Low Threshold */
+#define XGMAC_FLOW_CTRL_PLT_MASK 0x00000030	/* PLT MASK */
+#define XGMAC_FLOW_CTRL_UP	0x00000008	/* Unicast Pause Frame Detect */
+#define XGMAC_FLOW_CTRL_RFE	0x00000004	/* Rx Flow Control Enable */
+#define XGMAC_FLOW_CTRL_TFE	0x00000002	/* Tx Flow Control Enable */
+#define XGMAC_FLOW_CTRL_FCB_BPA	0x00000001	/* Flow Control Busy ... */
+
+/* XGMAC_INT_STAT reg */
+#define XGMAC_INT_STAT_PMT	0x0080		/* PMT Interrupt Status */
+#define XGMAC_INT_STAT_LPI	0x0040		/* LPI Interrupt Status */
+
+/* DMA Bus Mode register defines */
+#define DMA_BUS_MODE_SFT_RESET	0x00000001	/* Software Reset */
+#define DMA_BUS_MODE_DSL_MASK	0x0000007c	/* Descriptor Skip Length */
+#define DMA_BUS_MODE_DSL_SHIFT	2		/* (in DWORDS) */
+#define DMA_BUS_MODE_ATDS	0x00000080	/* Alternate Descriptor Size */
+
+/* Programmable burst length */
+#define DMA_BUS_MODE_PBL_MASK	0x00003f00	/* Programmable Burst Len */
+#define DMA_BUS_MODE_PBL_SHIFT	8
+#define DMA_BUS_MODE_FB		0x00010000	/* Fixed burst */
+#define DMA_BUS_MODE_RPBL_MASK	0x003e0000	/* Rx-Programmable Burst Len */
+#define DMA_BUS_MODE_RPBL_SHIFT	17
+#define DMA_BUS_MODE_USP	0x00800000
+#define DMA_BUS_MODE_8PBL	0x01000000
+#define DMA_BUS_MODE_AAL	0x02000000
+
+/* DMA Bus Mode register defines */
+#define DMA_BUS_PR_RATIO_MASK	0x0000c000	/* Rx/Tx priority ratio */
+#define DMA_BUS_PR_RATIO_SHIFT	14
+#define DMA_BUS_FB		0x00010000	/* Fixed Burst */
+
+/* DMA Control register defines */
+#define DMA_CONTROL_ST		0x00002000	/* Start/Stop Transmission */
+#define DMA_CONTROL_SR		0x00000002	/* Start/Stop Receive */
+#define DMA_CONTROL_DFF		0x01000000	/* Disable flush of rx frames */
+
+/* DMA Normal interrupt */
+#define DMA_INTR_ENA_NIE	0x00010000	/* Normal Summary */
+#define DMA_INTR_ENA_AIE	0x00008000	/* Abnormal Summary */
+#define DMA_INTR_ENA_ERE	0x00004000	/* Early Receive */
+#define DMA_INTR_ENA_FBE	0x00002000	/* Fatal Bus Error */
+#define DMA_INTR_ENA_ETE	0x00000400	/* Early Transmit */
+#define DMA_INTR_ENA_RWE	0x00000200	/* Receive Watchdog */
+#define DMA_INTR_ENA_RSE	0x00000100	/* Receive Stopped */
+#define DMA_INTR_ENA_RUE	0x00000080	/* Receive Buffer Unavailable */
+#define DMA_INTR_ENA_RIE	0x00000040	/* Receive Interrupt */
+#define DMA_INTR_ENA_UNE	0x00000020	/* Tx Underflow */
+#define DMA_INTR_ENA_OVE	0x00000010	/* Receive Overflow */
+#define DMA_INTR_ENA_TJE	0x00000008	/* Transmit Jabber */
+#define DMA_INTR_ENA_TUE	0x00000004	/* Transmit Buffer Unavail */
+#define DMA_INTR_ENA_TSE	0x00000002	/* Transmit Stopped */
+#define DMA_INTR_ENA_TIE	0x00000001	/* Transmit Interrupt */
+
+#define DMA_INTR_NORMAL		(DMA_INTR_ENA_NIE | DMA_INTR_ENA_RIE | \
+				 DMA_INTR_ENA_TUE)
+
+#define DMA_INTR_ABNORMAL	(DMA_INTR_ENA_AIE | DMA_INTR_ENA_FBE | \
+				 DMA_INTR_ENA_RWE | DMA_INTR_ENA_RSE | \
+				 DMA_INTR_ENA_RUE | DMA_INTR_ENA_UNE | \
+				 DMA_INTR_ENA_OVE | DMA_INTR_ENA_TJE | \
+				 DMA_INTR_ENA_TSE)
+
+/* DMA default interrupt mask */
+#define DMA_INTR_DEFAULT_MASK	(DMA_INTR_NORMAL | DMA_INTR_ABNORMAL)
+
+/* DMA Status register defines */
+#define DMA_STATUS_GMI		0x08000000	/* MMC interrupt */
+#define DMA_STATUS_GLI		0x04000000	/* GMAC Line interface int */
+#define DMA_STATUS_EB_MASK	0x00380000	/* Error Bits Mask */
+#define DMA_STATUS_EB_TX_ABORT	0x00080000	/* Error Bits - TX Abort */
+#define DMA_STATUS_EB_RX_ABORT	0x00100000	/* Error Bits - RX Abort */
+#define DMA_STATUS_TS_MASK	0x00700000	/* Transmit Process State */
+#define DMA_STATUS_TS_SHIFT	20
+#define DMA_STATUS_RS_MASK	0x000e0000	/* Receive Process State */
+#define DMA_STATUS_RS_SHIFT	17
+#define DMA_STATUS_NIS		0x00010000	/* Normal Interrupt Summary */
+#define DMA_STATUS_AIS		0x00008000	/* Abnormal Interrupt Summary */
+#define DMA_STATUS_ERI		0x00004000	/* Early Receive Interrupt */
+#define DMA_STATUS_FBI		0x00002000	/* Fatal Bus Error Interrupt */
+#define DMA_STATUS_ETI		0x00000400	/* Early Transmit Interrupt */
+#define DMA_STATUS_RWT		0x00000200	/* Receive Watchdog Timeout */
+#define DMA_STATUS_RPS		0x00000100	/* Receive Process Stopped */
+#define DMA_STATUS_RU		0x00000080	/* Receive Buffer Unavailable */
+#define DMA_STATUS_RI		0x00000040	/* Receive Interrupt */
+#define DMA_STATUS_UNF		0x00000020	/* Transmit Underflow */
+#define DMA_STATUS_OVF		0x00000010	/* Receive Overflow */
+#define DMA_STATUS_TJT		0x00000008	/* Transmit Jabber Timeout */
+#define DMA_STATUS_TU		0x00000004	/* Transmit Buffer Unavail */
+#define DMA_STATUS_TPS		0x00000002	/* Transmit Process Stopped */
+#define DMA_STATUS_TI		0x00000001	/* Transmit Interrupt */
+
+/* Common MAC defines */
+#define MAC_ENABLE_TX		0x00000008	/* Transmitter Enable */
+#define MAC_ENABLE_RX		0x00000004	/* Receiver Enable */
+
+/* XGMAC Operation Mode Register */
+#define XGMAC_OMR_TSF		0x00200000	/* TX FIFO Store and Forward */
+#define XGMAC_OMR_FTF		0x00100000	/* Flush Transmit FIFO */
+#define XGMAC_OMR_TTC		0x00020000	/* Transmit Threshold Ctrl */
+#define XGMAC_OMR_TTC_MASK	0x00030000
+#define XGMAC_OMR_RFD		0x00006000	/* FC Deactivation Threshold */
+#define XGMAC_OMR_RFD_MASK	0x00007000	/* FC Deact Threshold MASK */
+#define XGMAC_OMR_RFA		0x00000600	/* FC Activation Threshold */
+#define XGMAC_OMR_RFA_MASK	0x00000E00	/* FC Act Threshold MASK */
+#define XGMAC_OMR_EFC		0x00000100	/* Enable Hardware FC */
+#define XGMAC_OMR_FEF		0x00000080	/* Forward Error Frames */
+#define XGMAC_OMR_DT		0x00000040	/* Drop TCP/IP csum Errors */
+#define XGMAC_OMR_RSF		0x00000020	/* RX FIFO Store and Forward */
+#define XGMAC_OMR_RTC		0x00000010	/* RX Threshold Ctrl */
+#define XGMAC_OMR_RTC_MASK	0x00000018	/* RX Threshold Ctrl MASK */
+
+/* XGMAC HW Features Register */
+#define DMA_HW_FEAT_TXCOESEL	0x00010000	/* TX Checksum offload */
+
+/* XGMAC Descriptor Defines */
+#define MAX_DESC_BUF_SZ		(SZ_8K - 8)
+
+#define RXDESC_EXT_STATUS	0x00000001
+#define RXDESC_CRC_ERR		0x00000002
+#define RXDESC_RX_ERR		0x00000008
+#define RXDESC_RX_WDOG		0x00000010
+#define RXDESC_FRAME_TYPE	0x00000020
+#define RXDESC_GIANT_FRAME	0x00000080
+#define RXDESC_LAST_SEG		0x00000100
+#define RXDESC_FIRST_SEG	0x00000200
+#define RXDESC_VLAN_FRAME	0x00000400
+#define RXDESC_OVERFLOW_ERR	0x00000800
+#define RXDESC_LENGTH_ERR	0x00001000
+#define RXDESC_SA_FILTER_FAIL	0x00002000
+#define RXDESC_DESCRIPTOR_ERR	0x00004000
+#define RXDESC_ERROR_SUMMARY	0x00008000
+#define RXDESC_FRAME_LEN_OFFSET	16
+#define RXDESC_FRAME_LEN_MASK	0x3fff0000
+#define RXDESC_DA_FILTER_FAIL	0x40000000
+
+#define RXDESC1_END_RING	0x00008000
+
+#define RXDESC_IP_PAYLOAD_MASK	0x00000003
+#define RXDESC_IP_PAYLOAD_UDP	0x00000001
+#define RXDESC_IP_PAYLOAD_TCP	0x00000002
+#define RXDESC_IP_PAYLOAD_ICMP	0x00000003
+#define RXDESC_IP_HEADER_ERR	0x00000008
+#define RXDESC_IP_PAYLOAD_ERR	0x00000010
+#define RXDESC_IPV4_PACKET	0x00000040
+#define RXDESC_IPV6_PACKET	0x00000080
+#define TXDESC_UNDERFLOW_ERR	0x00000001
+#define TXDESC_JABBER_TIMEOUT	0x00000002
+#define TXDESC_LOCAL_FAULT	0x00000004
+#define TXDESC_REMOTE_FAULT	0x00000008
+#define TXDESC_VLAN_FRAME	0x00000010
+#define TXDESC_FRAME_FLUSHED	0x00000020
+#define TXDESC_IP_HEADER_ERR	0x00000040
+#define TXDESC_PAYLOAD_CSUM_ERR	0x00000080
+#define TXDESC_ERROR_SUMMARY	0x00008000
+#define TXDESC_SA_CTRL_INSERT	0x00040000
+#define TXDESC_SA_CTRL_REPLACE	0x00080000
+#define TXDESC_2ND_ADDR_CHAINED	0x00100000
+#define TXDESC_END_RING		0x00200000
+#define TXDESC_CSUM_IP		0x00400000
+#define TXDESC_CSUM_IP_PAYLD	0x00800000
+#define TXDESC_CSUM_ALL		0x00C00000
+#define TXDESC_CRC_EN_REPLACE	0x01000000
+#define TXDESC_CRC_EN_APPEND	0x02000000
+#define TXDESC_DISABLE_PAD	0x04000000
+#define TXDESC_FIRST_SEG	0x10000000
+#define TXDESC_LAST_SEG		0x20000000
+#define TXDESC_INTERRUPT	0x40000000
+
+#define DESC_OWN		0x80000000
+#define DESC_BUFFER1_SZ_MASK	0x00001fff
+#define DESC_BUFFER2_SZ_MASK	0x1fff0000
+#define DESC_BUFFER2_SZ_OFFSET	16
+
+struct xgmac_dma_desc {
+	__le32 flags;
+	__le32 buf_size;
+	__le32 buf1_addr;		/* Buffer 1 Address Pointer */
+	__le32 buf2_addr;		/* Buffer 2 Address Pointer */
+	__le32 ext_status;
+	__le32 res[3];
+};
+
+struct xgmac_extra_stats {
+	/* Transmit errors */
+	unsigned long tx_jabber;
+	unsigned long tx_frame_flushed;
+	unsigned long tx_payload_error;
+	unsigned long tx_ip_header_error;
+	unsigned long tx_local_fault;
+	unsigned long tx_remote_fault;
+	/* Receive errors */
+	unsigned long rx_watchdog;
+	unsigned long da_rx_filter_fail;
+	unsigned long sa_rx_filter_fail;
+	unsigned long rx_missed_cntr;
+	unsigned long rx_overflow_cntr;
+	unsigned long rx_payload_error;
+	unsigned long rx_ip_header_error;
+	/* Tx/Rx IRQ errors */
+	unsigned long tx_undeflow_irq;
+	unsigned long tx_process_stopped_irq;
+	unsigned long tx_jabber_irq;
+	unsigned long rx_overflow_irq;
+	unsigned long rx_buf_unav_irq;
+	unsigned long rx_process_stopped_irq;
+	unsigned long rx_watchdog_irq;
+	unsigned long tx_early_irq;
+	unsigned long fatal_bus_error_irq;
+};
+
+struct xgmac_priv {
+	struct xgmac_dma_desc *dma_rx;
+	struct sk_buff **rx_skbuff;
+	unsigned int rx_tail;
+	unsigned int rx_head;
+
+	struct xgmac_dma_desc *dma_tx;
+	struct sk_buff **tx_skbuff;
+	unsigned int tx_head;
+	unsigned int tx_tail;
+
+	void __iomem *base;
+	struct sk_buff_head rx_recycle;
+	unsigned int dma_buf_sz;
+	dma_addr_t dma_rx_phy;
+	dma_addr_t dma_tx_phy;
+
+	struct net_device *dev;
+	struct device *device;
+	struct napi_struct napi;
+
+	struct xgmac_extra_stats xstats;
+
+	int pmt_irq;
+	char rx_pause;
+	char tx_pause;
+	int wolopts;
+};
+
+/* XGMAC Configuration Settings */
+#define MAX_MTU			9000
+#define PAUSE_TIME		0x400
+
+#define DMA_RX_RING_SZ		256
+#define DMA_TX_RING_SZ		128
+/* minimum number of free TX descriptors required to wake up TX process */
+#define TX_THRESH		(DMA_TX_RING_SZ/4)
+
+/* DMA descriptor ring helpers */
+#define dma_ring_incr(n, s)	(((n) + 1) & ((s) - 1))
+#define dma_ring_space(h, t, s)	CIRC_SPACE(h, t, s)
+#define dma_ring_cnt(h, t, s)	CIRC_CNT(h, t, s)
+
+/* XGMAC Descriptor Access Helpers */
+static inline void desc_set_buf_len(struct xgmac_dma_desc *p, u32 buf_sz)
+{
+	if (buf_sz > MAX_DESC_BUF_SZ)
+		p->buf_size = cpu_to_le32(MAX_DESC_BUF_SZ |
+			(buf_sz - MAX_DESC_BUF_SZ) << DESC_BUFFER2_SZ_OFFSET);
+	else
+		p->buf_size = cpu_to_le32(buf_sz);
+}
+
+static inline int desc_get_buf_len(struct xgmac_dma_desc *p)
+{
+	u32 len = le32_to_cpu(p->buf_size);
+	return (len & DESC_BUFFER1_SZ_MASK) +
+		((len & DESC_BUFFER2_SZ_MASK) >> DESC_BUFFER2_SZ_OFFSET);
+}
+
+static inline void desc_init_rx_desc(struct xgmac_dma_desc *p, int ring_size,
+				     int buf_sz)
+{
+	struct xgmac_dma_desc *end = p + ring_size - 1;
+
+	memset(p, 0, sizeof(*p) * ring_size);
+
+	for (; p <= end; p++)
+		desc_set_buf_len(p, buf_sz);
+
+	end->buf_size |= cpu_to_le32(RXDESC1_END_RING);
+}
+
+static inline void desc_init_tx_desc(struct xgmac_dma_desc *p, u32 ring_size)
+{
+	memset(p, 0, sizeof(*p) * ring_size);
+	p[ring_size - 1].flags = cpu_to_le32(TXDESC_END_RING);
+}
+
+static inline int desc_get_owner(struct xgmac_dma_desc *p)
+{
+	return le32_to_cpu(p->flags) & DESC_OWN;
+}
+
+static inline void desc_set_rx_owner(struct xgmac_dma_desc *p)
+{
+	/* Clear all fields and set the owner */
+	p->flags = cpu_to_le32(DESC_OWN);
+}
+
+static inline void desc_set_tx_owner(struct xgmac_dma_desc *p, u32 flags)
+{
+	u32 tmpflags = le32_to_cpu(p->flags);
+	tmpflags &= TXDESC_END_RING;
+	tmpflags |= flags | DESC_OWN;
+	p->flags = cpu_to_le32(tmpflags);
+}
+
+static inline int desc_get_tx_ls(struct xgmac_dma_desc *p)
+{
+	return le32_to_cpu(p->flags) & TXDESC_LAST_SEG;
+}
+
+static inline u32 desc_get_buf_addr(struct xgmac_dma_desc *p)
+{
+	return le32_to_cpu(p->buf1_addr);
+}
+
+static inline void desc_set_buf_addr(struct xgmac_dma_desc *p,
+				     u32 paddr, int len)
+{
+	p->buf1_addr = cpu_to_le32(paddr);
+	if (len > MAX_DESC_BUF_SZ)
+		p->buf2_addr = cpu_to_le32(paddr + MAX_DESC_BUF_SZ);
+}
+
+static inline void desc_set_buf_addr_and_size(struct xgmac_dma_desc *p,
+					      u32 paddr, int len)
+{
+	desc_set_buf_len(p, len);
+	desc_set_buf_addr(p, paddr, len);
+}
+
+static inline int desc_get_rx_frame_len(struct xgmac_dma_desc *p)
+{
+	u32 data = le32_to_cpu(p->flags);
+	u32 len = (data & RXDESC_FRAME_LEN_MASK) >> RXDESC_FRAME_LEN_OFFSET;
+	if (data & RXDESC_FRAME_TYPE)
+		len -= ETH_FCS_LEN;
+
+	return len;
+}
+
+static void xgmac_dma_flush_tx_fifo(void __iomem *ioaddr)
+{
+	int timeout = 1000;
+	u32 reg = readl(ioaddr + XGMAC_OMR);
+	writel(reg | XGMAC_OMR_FTF, ioaddr + XGMAC_OMR);
+
+	while ((timeout-- > 0) && readl(ioaddr + XGMAC_OMR) & XGMAC_OMR_FTF)
+		udelay(1);
+}
+
+static int desc_get_tx_status(struct xgmac_priv *priv, struct xgmac_dma_desc *p)
+{
+	struct xgmac_extra_stats *x = &priv->xstats;
+	u32 status = le32_to_cpu(p->flags);
+
+	if (!(status & TXDESC_ERROR_SUMMARY))
+		return 0;
+
+	netdev_dbg(priv->dev, "tx desc error = 0x%08x\n", status);
+	if (status & TXDESC_JABBER_TIMEOUT)
+		x->tx_jabber++;
+	if (status & TXDESC_FRAME_FLUSHED)
+		x->tx_frame_flushed++;
+	if (status & TXDESC_UNDERFLOW_ERR)
+		xgmac_dma_flush_tx_fifo(priv->base);
+	if (status & TXDESC_IP_HEADER_ERR)
+		x->tx_ip_header_error++;
+	if (status & TXDESC_LOCAL_FAULT)
+		x->tx_local_fault++;
+	if (status & TXDESC_REMOTE_FAULT)
+		x->tx_remote_fault++;
+	if (status & TXDESC_PAYLOAD_CSUM_ERR)
+		x->tx_payload_error++;
+
+	return -1;
+}
+
+static int desc_get_rx_status(struct xgmac_priv *priv, struct xgmac_dma_desc *p)
+{
+	struct xgmac_extra_stats *x = &priv->xstats;
+	int ret = CHECKSUM_UNNECESSARY;
+	u32 status = le32_to_cpu(p->flags);
+	u32 ext_status = le32_to_cpu(p->ext_status);
+
+	if (status & RXDESC_DA_FILTER_FAIL) {
+		netdev_dbg(priv->dev, "XGMAC RX : Dest Address filter fail\n");
+		x->da_rx_filter_fail++;
+		return -1;
+	}
+
+	/* Check if packet has checksum already */
+	if ((status & RXDESC_FRAME_TYPE) && (status & RXDESC_EXT_STATUS) &&
+		!(ext_status & RXDESC_IP_PAYLOAD_MASK))
+		ret = CHECKSUM_NONE;
+
+	netdev_dbg(priv->dev, "rx status - frame type=%d, csum = %d, ext stat %08x\n",
+		   (status & RXDESC_FRAME_TYPE) ? 1 : 0, ret, ext_status);
+
+	if (!(status & RXDESC_ERROR_SUMMARY))
+		return ret;
+
+	/* Handle any errors */
+	if (status & (RXDESC_DESCRIPTOR_ERR | RXDESC_OVERFLOW_ERR |
+		RXDESC_GIANT_FRAME | RXDESC_LENGTH_ERR | RXDESC_CRC_ERR))
+		return -1;
+
+	if (status & RXDESC_EXT_STATUS) {
+		if (ext_status & RXDESC_IP_HEADER_ERR)
+			x->rx_ip_header_error++;
+		if (ext_status & RXDESC_IP_PAYLOAD_ERR)
+			x->rx_payload_error++;
+		netdev_dbg(priv->dev, "IP checksum error - stat %08x\n",
+			   ext_status);
+		return -1;
+	}
+
+	return ret;
+}
+
+static inline void xgmac_mac_enable(void __iomem *ioaddr)
+{
+	u32 value = readl(ioaddr + XGMAC_CONTROL);
+	value |= MAC_ENABLE_RX | MAC_ENABLE_TX;
+	writel(value, ioaddr + XGMAC_CONTROL);
+
+	value = readl(ioaddr + XGMAC_DMA_CONTROL);
+	value |= DMA_CONTROL_ST | DMA_CONTROL_SR;
+	writel(value, ioaddr + XGMAC_DMA_CONTROL);
+}
+
+static inline void xgmac_mac_disable(void __iomem *ioaddr)
+{
+	u32 value = readl(ioaddr + XGMAC_DMA_CONTROL);
+	value &= ~(DMA_CONTROL_ST | DMA_CONTROL_SR);
+	writel(value, ioaddr + XGMAC_DMA_CONTROL);
+
+	value = readl(ioaddr + XGMAC_CONTROL);
+	value &= ~(MAC_ENABLE_TX | MAC_ENABLE_RX);
+	writel(value, ioaddr + XGMAC_CONTROL);
+}
+
+static void xgmac_set_mac_addr(void __iomem *ioaddr, unsigned char *addr,
+			       int num)
+{
+	u32 data;
+
+	data = (addr[5] << 8) | addr[4] | (num ? XGMAC_ADDR_AE : 0);
+	writel(data, ioaddr + XGMAC_ADDR_HIGH(num));
+	data = (addr[3] << 24) | (addr[2] << 16) | (addr[1] << 8) | addr[0];
+	writel(data, ioaddr + XGMAC_ADDR_LOW(num));
+}
+
+static void xgmac_get_mac_addr(void __iomem *ioaddr, unsigned char *addr,
+			       int num)
+{
+	u32 hi_addr, lo_addr;
+
+	/* Read the MAC address from the hardware */
+	hi_addr = readl(ioaddr + XGMAC_ADDR_HIGH(num));
+	lo_addr = readl(ioaddr + XGMAC_ADDR_LOW(num));
+
+	/* Extract the MAC address from the high and low words */
+	addr[0] = lo_addr & 0xff;
+	addr[1] = (lo_addr >> 8) & 0xff;
+	addr[2] = (lo_addr >> 16) & 0xff;
+	addr[3] = (lo_addr >> 24) & 0xff;
+	addr[4] = hi_addr & 0xff;
+	addr[5] = (hi_addr >> 8) & 0xff;
+}
+
+static int xgmac_set_flow_ctrl(struct xgmac_priv *priv, int rx, int tx)
+{
+	u32 reg;
+	unsigned int flow = 0;
+
+	priv->rx_pause = rx;
+	priv->tx_pause = tx;
+
+	if (rx || tx) {
+		if (rx)
+			flow |= XGMAC_FLOW_CTRL_RFE;
+		if (tx)
+			flow |= XGMAC_FLOW_CTRL_TFE;
+
+		flow |= XGMAC_FLOW_CTRL_PLT | XGMAC_FLOW_CTRL_UP;
+		flow |= (PAUSE_TIME << XGMAC_FLOW_CTRL_PT_SHIFT);
+
+		writel(flow, priv->base + XGMAC_FLOW_CTRL);
+
+		reg = readl(priv->base + XGMAC_OMR);
+		reg |= XGMAC_OMR_EFC;
+		writel(reg, priv->base + XGMAC_OMR);
+	} else {
+		writel(0, priv->base + XGMAC_FLOW_CTRL);
+
+		reg = readl(priv->base + XGMAC_OMR);
+		reg &= ~XGMAC_OMR_EFC;
+		writel(reg, priv->base + XGMAC_OMR);
+	}
+
+	return 0;
+}
+
+static void xgmac_rx_refill(struct xgmac_priv *priv)
+{
+	struct xgmac_dma_desc *p;
+	dma_addr_t paddr;
+
+	while (dma_ring_space(priv->rx_head, priv->rx_tail, DMA_RX_RING_SZ) > 1) {
+		int entry = priv->rx_head;
+		struct sk_buff *skb;
+
+		p = priv->dma_rx + entry;
+
+		if (priv->rx_skbuff[entry] != NULL)
+			continue;
+
+		skb = __skb_dequeue(&priv->rx_recycle);
+		if (skb == NULL)
+			skb = netdev_alloc_skb(priv->dev, priv->dma_buf_sz);
+		if (unlikely(skb == NULL))
+			break;
+
+		priv->rx_skbuff[entry] = skb;
+		paddr = dma_map_single(priv->device, skb->data,
+					 priv->dma_buf_sz, DMA_FROM_DEVICE);
+		desc_set_buf_addr(p, paddr, priv->dma_buf_sz);
+
+		netdev_dbg(priv->dev, "rx ring: head %d, tail %d\n",
+			priv->rx_head, priv->rx_tail);
+
+		priv->rx_head = dma_ring_incr(priv->rx_head, DMA_RX_RING_SZ);
+		/* Ensure descriptor is in memory before handing to h/w */
+		wmb();
+		desc_set_rx_owner(p);
+	}
+}
+
+/**
+ * xgmac_dma_desc_rings_init - init the RX/TX descriptor rings
+ * @dev: net device structure
+ * Description:  this function initializes the DMA RX/TX descriptors
+ * and allocates the socket buffers.
+ */
+static int xgmac_dma_desc_rings_init(struct net_device *dev)
+{
+	struct xgmac_priv *priv = netdev_priv(dev);
+	unsigned int bfsize;
+
+	/* Set the Buffer size according to the MTU;
+	 * indeed, in case of jumbo we need to bump-up the buffer sizes.
+	 */
+	bfsize = ALIGN(dev->mtu + ETH_HLEN + ETH_FCS_LEN + NET_IP_ALIGN + 64,
+		       64);
+
+	netdev_dbg(priv->dev, "mtu [%d] bfsize [%d]\n", dev->mtu, bfsize);
+
+	priv->rx_skbuff = kzalloc(sizeof(struct sk_buff *) * DMA_RX_RING_SZ,
+				  GFP_KERNEL);
+	if (!priv->rx_skbuff)
+		return -ENOMEM;
+
+	priv->dma_rx = dma_alloc_coherent(priv->device,
+					  DMA_RX_RING_SZ *
+					  sizeof(struct xgmac_dma_desc),
+					  &priv->dma_rx_phy,
+					  GFP_KERNEL);
+	if (!priv->dma_rx)
+		goto err_dma_rx;
+
+	priv->tx_skbuff = kzalloc(sizeof(struct sk_buff *) * DMA_TX_RING_SZ,
+				  GFP_KERNEL);
+	if (!priv->tx_skbuff)
+		goto err_tx_skb;
+
+	priv->dma_tx = dma_alloc_coherent(priv->device,
+					  DMA_TX_RING_SZ *
+					  sizeof(struct xgmac_dma_desc),
+					  &priv->dma_tx_phy,
+					  GFP_KERNEL);
+	if (!priv->dma_tx)
+		goto err_dma_tx;
+
+	netdev_dbg(priv->dev, "DMA desc rings: virt addr (Rx %p, "
+	    "Tx %p)\n\tDMA phy addr (Rx 0x%08x, Tx 0x%08x)\n",
+	    priv->dma_rx, priv->dma_tx,
+	    (unsigned int)priv->dma_rx_phy, (unsigned int)priv->dma_tx_phy);
+
+	priv->rx_tail = 0;
+	priv->rx_head = 0;
+	priv->dma_buf_sz = bfsize;
+	desc_init_rx_desc(priv->dma_rx, DMA_RX_RING_SZ, priv->dma_buf_sz);
+	xgmac_rx_refill(priv);
+
+	priv->tx_tail = 0;
+	priv->tx_head = 0;
+	desc_init_tx_desc(priv->dma_tx, DMA_TX_RING_SZ);
+
+	/* The base address of the RX/TX descriptor lists must be written into
+	 * DMA CSR3 and CSR4, respectively. */
+	writel(priv->dma_tx_phy, priv->base + XGMAC_DMA_TX_BASE_ADDR);
+	writel(priv->dma_rx_phy, priv->base + XGMAC_DMA_RX_BASE_ADDR);
+
+	return 0;
+
+err_dma_tx:
+	kfree(priv->tx_skbuff);
+err_tx_skb:
+	dma_free_coherent(priv->device,
+			  DMA_RX_RING_SZ * sizeof(struct xgmac_dma_desc),
+			  priv->dma_rx, priv->dma_rx_phy);
+err_dma_rx:
+	kfree(priv->rx_skbuff);
+	return -ENOMEM;
+}
+
+static void xgmac_free_rx_skbufs(struct xgmac_priv *priv)
+{
+	int i;
+	struct xgmac_dma_desc *p;
+
+	for (i = 0; i < DMA_RX_RING_SZ; i++) {
+		if (priv->rx_skbuff[i] == NULL)
+			continue;
+
+		p = priv->dma_rx + i;
+		dma_unmap_single(priv->device, desc_get_buf_addr(p),
+				 priv->dma_buf_sz, DMA_FROM_DEVICE);
+		dev_kfree_skb_any(priv->rx_skbuff[i]);
+		priv->rx_skbuff[i] = NULL;
+	}
+}
+
+static void xgmac_free_tx_skbufs(struct xgmac_priv *priv)
+{
+	int i;
+	struct xgmac_dma_desc *p;
+
+	for (i = 0; i < DMA_TX_RING_SZ; i++) {
+		if (priv->tx_skbuff[i] == NULL)
+			continue;
+
+		p = priv->dma_tx + i;
+		dma_unmap_single(priv->device, desc_get_buf_addr(p),
+				 desc_get_buf_len(p), DMA_TO_DEVICE);
+		dev_kfree_skb_any(priv->tx_skbuff[i]);
+		priv->tx_skbuff[i] = NULL;
+	}
+}
+
+static void xgmac_free_dma_desc_rings(struct xgmac_priv *priv)
+{
+	/* Release the DMA TX/RX socket buffers */
+	xgmac_free_rx_skbufs(priv);
+	xgmac_free_tx_skbufs(priv);
+
+	/* Free the consistent memory allocated for descriptor rings */
+	dma_free_coherent(priv->device,
+			  DMA_TX_RING_SZ * sizeof(struct xgmac_dma_desc),
+			  priv->dma_tx, priv->dma_tx_phy);
+	priv->dma_tx = NULL;
+	dma_free_coherent(priv->device,
+			  DMA_RX_RING_SZ * sizeof(struct xgmac_dma_desc),
+			  priv->dma_rx, priv->dma_rx_phy);
+	priv->dma_rx = NULL;
+
+	kfree(priv->rx_skbuff);
+	priv->rx_skbuff = NULL;
+	kfree(priv->tx_skbuff);
+	priv->tx_skbuff = NULL;
+}
+
+/**
+ * xgmac_tx_complete:
+ * @priv: private driver structure
+ * Description: it reclaims resources after transmission completes.
+ */
+static void xgmac_tx_complete(struct xgmac_priv *priv)
+{
+	void __iomem *ioaddr = priv->base;
+
+
+	writel(DMA_STATUS_TU | DMA_STATUS_NIS, ioaddr + XGMAC_DMA_STATUS);
+
+	while (dma_ring_cnt(priv->tx_head, priv->tx_tail, DMA_TX_RING_SZ)) {
+		unsigned int entry = priv->tx_tail;
+		struct sk_buff *skb = priv->tx_skbuff[entry];
+		struct xgmac_dma_desc *p = priv->dma_tx + entry;
+
+		/* Check if the descriptor is owned by the DMA. */
+		if (desc_get_owner(p))
+			break;
+
+		/* Verify tx error by looking at the last segment */
+		if (desc_get_tx_ls(p))
+			desc_get_tx_status(priv, p);
+
+		netdev_dbg(priv->dev, "tx ring: curr %d, dirty %d\n",
+			priv->tx_head, priv->tx_tail);
+
+		dma_unmap_single(priv->device, desc_get_buf_addr(p),
+				 desc_get_buf_len(p), DMA_TO_DEVICE);
+
+		if (skb) {
+			/*
+			 * If there's room in the queue (limit it to size)
+			 * we add this skb back into the pool,
+			 * if it's the right size.
+			 */
+			if ((skb_queue_len(&priv->rx_recycle) <
+				DMA_RX_RING_SZ) &&
+				skb_recycle_check(skb, priv->dma_buf_sz))
+				__skb_queue_head(&priv->rx_recycle, skb);
+			else
+				dev_kfree_skb(skb);
+
+			priv->tx_skbuff[entry] = NULL;
+		}
+
+		priv->tx_tail = dma_ring_incr(priv->tx_tail, DMA_TX_RING_SZ);
+	}
+
+	if (dma_ring_space(priv->tx_head, priv->tx_tail, DMA_TX_RING_SZ) >
+	    TX_THRESH)
+		netif_wake_queue(priv->dev);
+}
+
+/**
+ * xgmac_tx_err:
+ * @priv: pointer to the private device structure
+ * Description: it cleans the descriptors and restarts the transmission
+ * in case of errors.
+ */
+static void xgmac_tx_err(struct xgmac_priv *priv)
+{
+	u32 reg, value, inten;
+
+	netif_stop_queue(priv->dev);
+
+	inten = readl(priv->base + XGMAC_DMA_INTR_ENA);
+	writel(0, priv->base + XGMAC_DMA_INTR_ENA);
+
+	reg = readl(priv->base + XGMAC_DMA_CONTROL);
+	writel(reg & ~DMA_CONTROL_ST, priv->base + XGMAC_DMA_CONTROL);
+	do {
+		value = readl(priv->base + XGMAC_DMA_STATUS) & 0x700000;
+	} while (value && (value != 0x600000));
+
+	xgmac_free_tx_skbufs(priv);
+	desc_init_tx_desc(priv->dma_tx, DMA_TX_RING_SZ);
+	priv->tx_tail = 0;
+	priv->tx_head = 0;
+	writel(reg | DMA_CONTROL_ST, priv->base + XGMAC_DMA_CONTROL);
+
+	writel(DMA_STATUS_TU | DMA_STATUS_TPS | DMA_STATUS_NIS | DMA_STATUS_AIS,
+		priv->base + XGMAC_DMA_STATUS);
+	writel(inten, priv->base + XGMAC_DMA_INTR_ENA);
+
+	netif_wake_queue(priv->dev);
+}
+
+static int xgmac_hw_init(struct net_device *dev)
+{
+	u32 value, ctrl;
+	int limit;
+	struct xgmac_priv *priv = netdev_priv(dev);
+	void __iomem *ioaddr = priv->base;
+
+	/* Save the ctrl register value */
+	ctrl = readl(ioaddr + XGMAC_CONTROL) & XGMAC_CONTROL_SPD_MASK;
+
+	/* SW reset */
+	value = DMA_BUS_MODE_SFT_RESET;
+	writel(value, ioaddr + XGMAC_DMA_BUS_MODE);
+	limit = 15000;
+	while (limit-- &&
+		(readl(ioaddr + XGMAC_DMA_BUS_MODE) & DMA_BUS_MODE_SFT_RESET))
+		cpu_relax();
+	if (limit < 0)
+		return -EBUSY;
+
+	value = (0x10 << DMA_BUS_MODE_PBL_SHIFT) |
+		(0x10 << DMA_BUS_MODE_RPBL_SHIFT) |
+		DMA_BUS_MODE_FB | DMA_BUS_MODE_ATDS | DMA_BUS_MODE_AAL;
+	writel(value, ioaddr + XGMAC_DMA_BUS_MODE);
+
+	/* Enable interrupts */
+	writel(DMA_INTR_DEFAULT_MASK, ioaddr + XGMAC_DMA_STATUS);
+	writel(DMA_INTR_DEFAULT_MASK, ioaddr + XGMAC_DMA_INTR_ENA);
+
+	/* XGMAC requires AXI bus init. This is a 'magic number' for now */
+	writel(0x000100E, ioaddr + XGMAC_DMA_AXI_BUS);
+
+	ctrl |= XGMAC_CONTROL_DDIC | XGMAC_CONTROL_JE | XGMAC_CONTROL_ACS |
+		XGMAC_CONTROL_CAR;
+	if (dev->features & NETIF_F_RXCSUM)
+		ctrl |= XGMAC_CONTROL_IPC;
+	writel(ctrl, ioaddr + XGMAC_CONTROL);
+
+	value = DMA_CONTROL_DFF;
+	writel(value, ioaddr + XGMAC_DMA_CONTROL);
+
+	/* Set the HW DMA mode and the COE */
+	writel(XGMAC_OMR_TSF | XGMAC_OMR_RSF | XGMAC_OMR_RFD | XGMAC_OMR_RFA,
+		ioaddr + XGMAC_OMR);
+
+	/* Reset the MMC counters */
+	writel(1, ioaddr + XGMAC_MMC_CTRL);
+	return 0;
+}
+
+/**
+ *  xgmac_open - open entry point of the driver
+ *  @dev : pointer to the device structure.
+ *  Description:
+ *  This function is the open entry point of the driver.
+ *  Return value:
+ *  0 on success and an appropriate (-)ve integer as defined in errno.h
+ *  file on failure.
+ */
+static int xgmac_open(struct net_device *dev)
+{
+	int ret;
+	struct xgmac_priv *priv = netdev_priv(dev);
+	void __iomem *ioaddr = priv->base;
+
+	/* Check that the MAC address is valid.  If it's not, generate
+	 * a random one so the device can still be brought up. The user
+	 * can set a specific address with the following linux command:
+	 *      ifconfig eth0 hw ether xx:xx:xx:xx:xx:xx  */
+	if (!is_valid_ether_addr(dev->dev_addr)) {
+		random_ether_addr(dev->dev_addr);
+		netdev_dbg(priv->dev, "generated random MAC address %pM\n",
+			dev->dev_addr);
+	}
+
+	skb_queue_head_init(&priv->rx_recycle);
+	memset(&priv->xstats, 0, sizeof(struct xgmac_extra_stats));
+
+	/* Initialize the XGMAC and descriptors */
+	xgmac_hw_init(dev);
+	xgmac_set_mac_addr(ioaddr, dev->dev_addr, 0);
+	xgmac_set_flow_ctrl(priv, priv->rx_pause, priv->tx_pause);
+
+	ret = xgmac_dma_desc_rings_init(dev);
+	if (ret < 0)
+		return ret;
+
+	/* Enable the MAC Rx/Tx */
+	xgmac_mac_enable(ioaddr);
+
+	napi_enable(&priv->napi);
+	netif_start_queue(dev);
+
+	enable_irq(dev->irq);
+
+	return 0;
+}
+
+/**
+ *  xgmac_release - close entry point of the driver
+ *  @dev : device pointer.
+ *  Description:
+ *  This is the stop entry point of the driver.
+ */
+static int xgmac_release(struct net_device *dev)
+{
+	struct xgmac_priv *priv = netdev_priv(dev);
+
+	netif_stop_queue(dev);
+
+	disable_irq(dev->irq);
+	napi_disable(&priv->napi);
+	skb_queue_purge(&priv->rx_recycle);
+
+	/* Disable the MAC core */
+	xgmac_mac_disable(priv->base);
+
+	/* Release and free the Rx/Tx resources */
+	xgmac_free_dma_desc_rings(priv);
+
+	return 0;
+}
+
+/**
+ *  xgmac_xmit:
+ *  @skb : the socket buffer
+ *  @dev : device pointer
+ *  Description : Tx entry point of the driver.
+ */
+static netdev_tx_t xgmac_xmit(struct sk_buff *skb, struct net_device *dev)
+{
+	struct xgmac_priv *priv = netdev_priv(dev);
+	unsigned int entry;
+	int i;
+	int nfrags = skb_shinfo(skb)->nr_frags;
+	struct xgmac_dma_desc *desc, *first;
+	unsigned int desc_flags;
+	unsigned int len;
+	dma_addr_t paddr;
+
+	if (dma_ring_space(priv->tx_head, priv->tx_tail, DMA_TX_RING_SZ) <
+	    (nfrags + 1)) {
+		writel(DMA_INTR_DEFAULT_MASK | DMA_INTR_ENA_TIE,
+			priv->base + XGMAC_DMA_INTR_ENA);
+		netif_stop_queue(dev);
+		return NETDEV_TX_BUSY;
+	}
+
+	desc_flags = (skb->ip_summed == CHECKSUM_PARTIAL) ?
+		TXDESC_CSUM_ALL : 0;
+	entry = priv->tx_head;
+	desc = priv->dma_tx + entry;
+	first = desc;
+
+	priv->tx_skbuff[entry] = skb;
+	len = skb_headlen(skb);
+	paddr = dma_map_single(priv->device, skb->data, len, DMA_TO_DEVICE);
+	desc_set_buf_addr_and_size(desc, paddr, len);
+
+	for (i = 0; i < nfrags; i++) {
+		skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
+
+		len = frag->size;
+		entry = dma_ring_incr(entry, DMA_TX_RING_SZ);
+		desc = priv->dma_tx + entry;
+
+		paddr = dma_map_page(priv->device, frag->page.p,
+				frag->page_offset, len, DMA_TO_DEVICE);
+		priv->tx_skbuff[entry] = NULL;
+
+		desc_set_buf_addr_and_size(desc, paddr, len);
+		if (i < (nfrags - 1))
+			desc_set_tx_owner(desc, desc_flags);
+	}
+
+	/* Interrupt on completion only for the last segment */
+	if (desc != first)
+		desc_set_tx_owner(desc, desc_flags |
+			TXDESC_LAST_SEG | TXDESC_INTERRUPT);
+	else
+		desc_flags |= TXDESC_LAST_SEG | TXDESC_INTERRUPT;
+
+	/* Set owner on first desc last to avoid race condition */
+	wmb();
+	desc_set_tx_owner(first, desc_flags | TXDESC_FIRST_SEG);
+
+	priv->tx_head = dma_ring_incr(entry, DMA_TX_RING_SZ);
+
+	writel(1, priv->base + XGMAC_DMA_TX_POLL);
+
+	return NETDEV_TX_OK;
+}
+
+static int xgmac_rx(struct xgmac_priv *priv, int limit)
+{
+	unsigned int entry;
+	unsigned int count = 0;
+	struct xgmac_dma_desc *p;
+
+	while (count < limit) {
+		int ip_checksum;
+		struct sk_buff *skb;
+		int frame_len;
+
+		writel(DMA_STATUS_RI | DMA_STATUS_NIS,
+		       priv->base + XGMAC_DMA_STATUS);
+
+		entry = priv->rx_tail;
+		p = priv->dma_rx + entry;
+		if (desc_get_owner(p))
+			break;
+
+		count++;
+		priv->rx_tail = dma_ring_incr(priv->rx_tail, DMA_RX_RING_SZ);
+
+		/* read the status of the incoming frame */
+		ip_checksum = desc_get_rx_status(priv, p);
+		if (ip_checksum < 0)
+			continue;
+
+		skb = priv->rx_skbuff[entry];
+		if (unlikely(!skb)) {
+			netdev_err(priv->dev, "Inconsistent Rx descriptor chain\n");
+			break;
+		}
+		priv->rx_skbuff[entry] = NULL;
+
+		frame_len = desc_get_rx_frame_len(p);
+		netdev_dbg(priv->dev, "RX frame size %d, COE status: %d\n",
+			frame_len, ip_checksum);
+
+		skb_put(skb, frame_len);
+		dma_unmap_single(priv->device, desc_get_buf_addr(p),
+				 frame_len, DMA_FROM_DEVICE);
+
+		skb->protocol = eth_type_trans(skb, priv->dev);
+		skb->ip_summed = ip_checksum;
+		if (ip_checksum == CHECKSUM_NONE)
+			netif_receive_skb(skb);
+		else
+			napi_gro_receive(&priv->napi, skb);
+	}
+
+	xgmac_rx_refill(priv);
+
+	writel(1, priv->base + XGMAC_DMA_RX_POLL);
+
+	return count;
+}
+
+/**
+ *  xgmac_poll - xgmac poll method (NAPI)
+ *  @napi : pointer to the napi structure.
+ *  @budget : maximum number of packets that the current CPU can receive from
+ *	      all interfaces.
+ *  Description :
+ *   This function implements the reception process.
+ *   It also runs the TX completion handling.
+ */
+static int xgmac_poll(struct napi_struct *napi, int budget)
+{
+	struct xgmac_priv *priv = container_of(napi,
+				       struct xgmac_priv, napi);
+	int work_done = 0;
+
+	xgmac_tx_complete(priv);
+	work_done = xgmac_rx(priv, budget);
+
+	if (work_done < budget) {
+		napi_complete(napi);
+		writel(DMA_INTR_DEFAULT_MASK, priv->base + XGMAC_DMA_INTR_ENA);
+	}
+	return work_done;
+}
+
+/**
+ *  xgmac_tx_timeout
+ *  @dev : Pointer to net device structure
+ *  Description: this function is called when a packet transmission fails to
+ *   complete within a reasonable time. The driver will mark the error in the
+ *   netdev structure and arrange for the device to be reset to a sane state
+ *   in order to transmit a new packet.
+ */
+static void xgmac_tx_timeout(struct net_device *dev)
+{
+	struct xgmac_priv *priv = netdev_priv(dev);
+
+	/* Clear Tx resources and restart transmitting again */
+	xgmac_tx_err(priv);
+}
+
+/**
+ *  xgmac_set_rx_mode - entry point for multicast addressing
+ *  @dev : pointer to the device structure
+ *  Description:
+ *  This function is a driver entry point which gets called by the kernel
+ *  whenever multicast addresses must be enabled/disabled.
+ *  Return value:
+ *  void.
+ */
+static void xgmac_set_rx_mode(struct net_device *dev)
+{
+	int i;
+	struct xgmac_priv *priv = netdev_priv(dev);
+	void __iomem *ioaddr = priv->base;
+	unsigned int value = 0;
+	u32 mc_filter[XGMAC_NUM_HASH];
+	int reg = 1;
+	struct netdev_hw_addr *ha;
+	bool use_hash = false;
+
+	netdev_dbg(priv->dev, "# mcasts %d, # unicast %d\n",
+		 netdev_mc_count(dev), netdev_uc_count(dev));
+
+	if (dev->flags & IFF_PROMISC) {
+		writel(XGMAC_FRAME_FILTER_PR, ioaddr + XGMAC_FRAME_FILTER);
+		return;
+	}
+
+	memset(mc_filter, 0, sizeof(mc_filter));
+
+	if (netdev_uc_count(dev) > XGMAC_MAX_FILTER_ADDR) {
+		use_hash = true;
+		value |= XGMAC_FRAME_FILTER_HUC | XGMAC_FRAME_FILTER_HPF;
+	}
+	netdev_for_each_uc_addr(ha, dev) {
+		if (use_hash) {
+			u32 bit_nr = ~ether_crc(ETH_ALEN, ha->addr) >> 23;
+
+			/* The most significant 4 bits determine the register to
+			 * use (H/L) while the other 5 bits determine the bit
+			 * within the register. */
+			mc_filter[bit_nr >> 5] |= 1 << (bit_nr & 31);
+		} else {
+			xgmac_set_mac_addr(ioaddr, ha->addr, reg);
+			reg++;
+		}
+	}
+
+	if (dev->flags & IFF_ALLMULTI) {
+		value |= XGMAC_FRAME_FILTER_PM;
+		goto out;
+	}
+
+	if ((netdev_mc_count(dev) + reg - 1) > XGMAC_MAX_FILTER_ADDR) {
+		use_hash = true;
+		value |= XGMAC_FRAME_FILTER_HMC | XGMAC_FRAME_FILTER_HPF;
+	}
+	netdev_for_each_mc_addr(ha, dev) {
+		if (use_hash) {
+			u32 bit_nr = ~ether_crc(ETH_ALEN, ha->addr) >> 23;
+
+			/* The most significant 4 bits determine the register to
+			 * use (H/L) while the other 5 bits determine the bit
+			 * within the register. */
+			mc_filter[bit_nr >> 5] |= 1 << (bit_nr & 31);
+		} else {
+			xgmac_set_mac_addr(ioaddr, ha->addr, reg);
+			reg++;
+		}
+	}
+
+out:
+	for (i = 0; i < XGMAC_NUM_HASH; i++)
+		writel(mc_filter[i], ioaddr + XGMAC_HASH(i));
+
+	writel(value, ioaddr + XGMAC_FRAME_FILTER);
+}
+
+/**
+ *  xgmac_change_mtu - entry point to change MTU size for the device.
+ *  @dev : device pointer.
+ *  @new_mtu : the new MTU size for the device.
+ *  Description: the Maximum Transfer Unit (MTU) is used by the network layer
+ *  to drive packet transmission. Ethernet has an MTU of 1500 octets
+ *  (ETH_DATA_LEN). This value can be changed with ifconfig.
+ *  Return value:
+ *  0 on success and an appropriate (-)ve integer as defined in errno.h
+ *  file on failure.
+ */
+static int xgmac_change_mtu(struct net_device *dev, int new_mtu)
+{
+	struct xgmac_priv *priv = netdev_priv(dev);
+	int old_mtu;
+
+	if ((new_mtu < 46) || (new_mtu > MAX_MTU)) {
+		netdev_err(priv->dev, "invalid MTU, max MTU is: %d\n", MAX_MTU);
+		return -EINVAL;
+	}
+
+	old_mtu = dev->mtu;
+	dev->mtu = new_mtu;
+
+	/* return early if the buffer sizes will not change */
+	if (old_mtu <= ETH_DATA_LEN && new_mtu <= ETH_DATA_LEN)
+		return 0;
+	if (old_mtu == new_mtu)
+		return 0;
+
+	/* Stop everything, get ready to change the MTU */
+	if (!netif_running(dev))
+		return 0;
+
+	/* Bring the interface down and then back up */
+	xgmac_release(dev);
+	xgmac_open(dev);
+
+	return 0;
+}
+
+static irqreturn_t xgmac_pmt_interrupt(int irq, void *dev_id)
+{
+	u32 intr_status;
+	struct net_device *dev = (struct net_device *)dev_id;
+	struct xgmac_priv *priv = netdev_priv(dev);
+	void __iomem *ioaddr = priv->base;
+
+	intr_status = readl(ioaddr + XGMAC_INT_STAT);
+	if (intr_status & XGMAC_INT_STAT_PMT) {
+		netdev_dbg(priv->dev, "received Magic frame\n");
+		/* clear the PMT bits 5 and 6 by reading the PMT */
+		readl(ioaddr + XGMAC_PMT);
+	}
+	return IRQ_HANDLED;
+}
+
+static irqreturn_t xgmac_interrupt(int irq, void *dev_id)
+{
+	u32 intr_status;
+	bool tx_err = false;
+	struct net_device *dev = (struct net_device *)dev_id;
+	struct xgmac_priv *priv = netdev_priv(dev);
+	struct xgmac_extra_stats *x = &priv->xstats;
+
+	/* read the status register (CSR5) */
+	intr_status = readl(priv->base + XGMAC_DMA_STATUS);
+	intr_status &= readl(priv->base + XGMAC_DMA_INTR_ENA);
+	writel(intr_status, priv->base + XGMAC_DMA_STATUS);
+
+	/* It displays the DMA process states (CSR5 register) */
+	/* ABNORMAL interrupts */
+	if (unlikely(intr_status & DMA_STATUS_AIS)) {
+		if (intr_status & DMA_STATUS_UNF) {
+			netdev_err(priv->dev, "transmit underflow\n");
+			tx_err = true;
+			x->tx_undeflow_irq++;
+		}
+		if (intr_status & DMA_STATUS_TJT) {
+			netdev_err(priv->dev, "transmit jabber\n");
+			x->tx_jabber_irq++;
+		}
+		if (intr_status & DMA_STATUS_OVF) {
+			netdev_err(priv->dev, "recv overflow\n");
+			x->rx_overflow_irq++;
+		}
+		if (intr_status & DMA_STATUS_RU)
+			x->rx_buf_unav_irq++;
+		if (intr_status & DMA_STATUS_RPS) {
+			netdev_err(priv->dev, "receive process stopped\n");
+			x->rx_process_stopped_irq++;
+		}
+		if (intr_status & DMA_STATUS_RWT) {
+			netdev_err(priv->dev, "receive watchdog\n");
+			x->rx_watchdog_irq++;
+		}
+		if (intr_status & DMA_STATUS_ETI) {
+			netdev_err(priv->dev, "transmit early interrupt\n");
+			x->tx_early_irq++;
+		}
+		if (intr_status & DMA_STATUS_TPS) {
+			netdev_err(priv->dev, "transmit process stopped\n");
+			x->tx_process_stopped_irq++;
+			tx_err = true;
+		}
+		if (intr_status & DMA_STATUS_FBI) {
+			netdev_err(priv->dev, "fatal bus error\n");
+			x->fatal_bus_error_irq++;
+			tx_err = true;
+		}
+
+		if (tx_err)
+			xgmac_tx_err(priv);
+	}
+
+	/* TX/RX NORMAL interrupts */
+	if (intr_status & (DMA_STATUS_RI | DMA_STATUS_TU)) {
+		writel(DMA_INTR_ABNORMAL, priv->base + XGMAC_DMA_INTR_ENA);
+		napi_schedule(&priv->napi);
+	}
+
+	return IRQ_HANDLED;
+}
+
+#ifdef CONFIG_NET_POLL_CONTROLLER
+/* Polling receive - used by NETCONSOLE and other diagnostic tools
+ * to allow network I/O with interrupts disabled. */
+static void xgmac_poll_controller(struct net_device *dev)
+{
+	disable_irq(dev->irq);
+	xgmac_interrupt(dev->irq, dev);
+	enable_irq(dev->irq);
+}
+#endif
+
+struct rtnl_link_stats64 *
+xgmac_get_stats64(struct net_device *dev,
+		       struct rtnl_link_stats64 *storage)
+{
+	struct xgmac_priv *priv = netdev_priv(dev);
+	void __iomem *base = priv->base;
+	u64 count;
+
+	storage->rx_packets = readl(base + XGMAC_MMC_RXFRAME_GB_LO);
+	storage->rx_packets |=
+		(u64)(readl(base + XGMAC_MMC_RXFRAME_GB_HI)) << 32;
+	storage->rx_bytes = readl(base + XGMAC_MMC_RXOCTET_G_LO);
+	storage->rx_bytes |= (u64)(readl(base + XGMAC_MMC_RXOCTET_G_HI)) << 32;
+
+	storage->multicast = readl(base + XGMAC_MMC_RXMCFRAME_G);
+	storage->rx_crc_errors = readl(base + XGMAC_MMC_RXCRCERR);
+	storage->rx_length_errors = readl(base + XGMAC_MMC_RXLENGTHERR);
+	storage->rx_missed_errors = readl(base + XGMAC_MMC_RXOVERFLOW);
+
+	storage->tx_packets = readl(base + XGMAC_MMC_TXFRAME_GB_LO);
+	storage->tx_packets |=
+		(u64)(readl(base + XGMAC_MMC_TXFRAME_GB_HI)) << 32;
+	storage->tx_bytes = readl(base + XGMAC_MMC_TXOCTET_G_LO);
+	storage->tx_bytes |= (u64)(readl(base + XGMAC_MMC_TXOCTET_G_HI)) << 32;
+
+	count = readl(base + XGMAC_MMC_TXFRAME_G_LO);
+	count |= (__u64)(readl(base + XGMAC_MMC_TXFRAME_G_HI)) << 32;
+	storage->tx_errors = storage->tx_packets - count;
+
+	storage->tx_fifo_errors = readl(base + XGMAC_MMC_TXUNDERFLOW);
+
+	return storage;
+}
+
+static int xgmac_set_mac_address(struct net_device *dev, void *p)
+{
+	struct xgmac_priv *priv = netdev_priv(dev);
+	void __iomem *ioaddr = priv->base;
+	struct sockaddr *addr = p;
+
+	if (!is_valid_ether_addr(addr->sa_data))
+		return -EADDRNOTAVAIL;
+
+	memcpy(dev->dev_addr, addr->sa_data, dev->addr_len);
+
+	xgmac_set_mac_addr(ioaddr, dev->dev_addr, 0);
+
+	return 0;
+}
+
+static int xgmac_set_features(struct net_device *dev, u32 features)
+{
+	u32 ctrl;
+	struct xgmac_priv *priv = netdev_priv(dev);
+	void __iomem *ioaddr = priv->base;
+	u32 changed = dev->features ^ features;
+
+	if (!(changed & NETIF_F_RXCSUM))
+		return 0;
+
+	ctrl = readl(ioaddr + XGMAC_CONTROL);
+	if (features & NETIF_F_RXCSUM)
+		ctrl |= XGMAC_CONTROL_IPC;
+	else
+		ctrl &= ~XGMAC_CONTROL_IPC;
+	writel(ctrl, ioaddr + XGMAC_CONTROL);
+
+	return 0;
+}
+
+static const struct net_device_ops xgmac_netdev_ops = {
+	.ndo_open = xgmac_open,
+	.ndo_start_xmit = xgmac_xmit,
+	.ndo_stop = xgmac_release,
+	.ndo_change_mtu = xgmac_change_mtu,
+	.ndo_set_rx_mode = xgmac_set_rx_mode,
+	.ndo_tx_timeout = xgmac_tx_timeout,
+	.ndo_get_stats64 = xgmac_get_stats64,
+#ifdef CONFIG_NET_POLL_CONTROLLER
+	.ndo_poll_controller = xgmac_poll_controller,
+#endif
+	.ndo_set_mac_address = xgmac_set_mac_address,
+	.ndo_set_features = xgmac_set_features,
+};
+
+static int xgmac_ethtool_getsettings(struct net_device *dev,
+					  struct ethtool_cmd *cmd)
+{
+	cmd->autoneg = 0;
+	cmd->duplex = DUPLEX_FULL;
+	ethtool_cmd_speed_set(cmd, 10000);
+	cmd->supported = SUPPORTED_10000baseT_Full;
+	cmd->advertising = 0;
+	cmd->transceiver = XCVR_INTERNAL;
+	return 0;
+}
+
+static void xgmac_get_pauseparam(struct net_device *netdev,
+				      struct ethtool_pauseparam *pause)
+{
+	struct xgmac_priv *priv = netdev_priv(netdev);
+
+	pause->rx_pause = priv->rx_pause;
+	pause->tx_pause = priv->tx_pause;
+}
+
+static int xgmac_set_pauseparam(struct net_device *netdev,
+				     struct ethtool_pauseparam *pause)
+{
+	struct xgmac_priv *priv = netdev_priv(netdev);
+	return xgmac_set_flow_ctrl(priv, pause->rx_pause, pause->tx_pause);
+}
+
+struct xgmac_stats {
+	char stat_string[ETH_GSTRING_LEN];
+	int stat_offset;
+	bool is_reg;
+};
+
+#define XGMAC_STAT(m)	\
+	{ #m, offsetof(struct xgmac_priv, xstats.m), false }
+#define XGMAC_HW_STAT(m, reg_offset)	\
+	{ #m, reg_offset, true }
+
+static const struct xgmac_stats xgmac_gstrings_stats[] = {
+	XGMAC_STAT(tx_frame_flushed),
+	XGMAC_STAT(tx_payload_error),
+	XGMAC_STAT(tx_ip_header_error),
+	XGMAC_STAT(tx_local_fault),
+	XGMAC_STAT(tx_remote_fault),
+	XGMAC_STAT(rx_watchdog),
+	XGMAC_STAT(da_rx_filter_fail),
+	XGMAC_STAT(sa_rx_filter_fail),
+	XGMAC_STAT(rx_missed_cntr),
+	XGMAC_STAT(rx_overflow_cntr),
+	XGMAC_STAT(tx_undeflow_irq),
+	XGMAC_STAT(tx_process_stopped_irq),
+	XGMAC_STAT(tx_jabber_irq),
+	XGMAC_STAT(rx_overflow_irq),
+	XGMAC_STAT(rx_buf_unav_irq),
+	XGMAC_STAT(rx_process_stopped_irq),
+	XGMAC_STAT(rx_watchdog_irq),
+	XGMAC_STAT(rx_payload_error),
+	XGMAC_STAT(rx_ip_header_error),
+	XGMAC_STAT(tx_early_irq),
+	XGMAC_STAT(fatal_bus_error_irq),
+	XGMAC_HW_STAT(tx_vlan, XGMAC_MMC_TXVLANFRAME),
+	XGMAC_HW_STAT(rx_vlan, XGMAC_MMC_RXVLANFRAME),
+	XGMAC_HW_STAT(tx_pause, XGMAC_MMC_TXPAUSEFRAME),
+	XGMAC_HW_STAT(rx_pause, XGMAC_MMC_RXPAUSEFRAME),
+};
+#define XGMAC_STATS_LEN ARRAY_SIZE(xgmac_gstrings_stats)
+
+static void xgmac_get_ethtool_stats(struct net_device *dev,
+					 struct ethtool_stats *dummy,
+					 u64 *data)
+{
+	struct xgmac_priv *priv = netdev_priv(dev);
+	void *p = priv;
+	int i;
+
+	for (i = 0; i < XGMAC_STATS_LEN; i++) {
+		if (xgmac_gstrings_stats[i].is_reg)
+			*data++ = readl(priv->base +
+				xgmac_gstrings_stats[i].stat_offset);
+		else
+			*data++ = *(u32 *)(p +
+				xgmac_gstrings_stats[i].stat_offset);
+	}
+}
+
+static int xgmac_get_sset_count(struct net_device *netdev, int sset)
+{
+	switch (sset) {
+	case ETH_SS_STATS:
+		return XGMAC_STATS_LEN;
+	default:
+		return -EOPNOTSUPP;
+	}
+}
+
+static void xgmac_get_strings(struct net_device *dev, u32 stringset,
+				   u8 *data)
+{
+	int i;
+	u8 *p = data;
+
+	switch (stringset) {
+	case ETH_SS_STATS:
+		for (i = 0; i < XGMAC_STATS_LEN; i++) {
+			memcpy(p, xgmac_gstrings_stats[i].stat_string,
+			       ETH_GSTRING_LEN);
+			p += ETH_GSTRING_LEN;
+		}
+		break;
+	default:
+		WARN_ON(1);
+		break;
+	}
+}
+
+static void xgmac_get_wol(struct net_device *dev,
+			       struct ethtool_wolinfo *wol)
+{
+	struct xgmac_priv *priv = netdev_priv(dev);
+
+	if (device_can_wakeup(priv->device)) {
+		wol->supported = WAKE_MAGIC | WAKE_UCAST;
+		wol->wolopts = priv->wolopts;
+	}
+}
+
+static int xgmac_set_wol(struct net_device *dev,
+			      struct ethtool_wolinfo *wol)
+{
+	struct xgmac_priv *priv = netdev_priv(dev);
+	u32 support = WAKE_MAGIC | WAKE_UCAST;
+
+	if (!device_can_wakeup(priv->device))
+		return -EINVAL;
+
+	if (wol->wolopts & ~support)
+		return -EINVAL;
+
+	priv->wolopts = wol->wolopts;
+
+	if (wol->wolopts) {
+		device_set_wakeup_enable(priv->device, 1);
+		enable_irq_wake(dev->irq);
+	} else {
+		device_set_wakeup_enable(priv->device, 0);
+		disable_irq_wake(dev->irq);
+	}
+
+	return 0;
+}
+
+static struct ethtool_ops xgmac_ethtool_ops = {
+	.get_settings = xgmac_ethtool_getsettings,
+	.get_link = ethtool_op_get_link,
+	.get_pauseparam = xgmac_get_pauseparam,
+	.set_pauseparam = xgmac_set_pauseparam,
+	.get_ethtool_stats = xgmac_get_ethtool_stats,
+	.get_strings = xgmac_get_strings,
+	.get_wol = xgmac_get_wol,
+	.set_wol = xgmac_set_wol,
+	.get_sset_count = xgmac_get_sset_count,
+};
+
+/**
+ * xgmac_probe
+ * @pdev: platform device pointer
+ * Description: the driver is initialized through platform_device.
+ */
+static int xgmac_probe(struct platform_device *pdev)
+{
+	int ret = 0;
+	struct resource *res;
+	struct net_device *ndev = NULL;
+	struct xgmac_priv *priv = NULL;
+	u32 uid;
+
+	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	if (!res)
+		return -ENODEV;
+
+	if (!request_mem_region(res->start, resource_size(res), pdev->name))
+		return -EBUSY;
+
+	ndev = alloc_etherdev(sizeof(struct xgmac_priv));
+	if (!ndev) {
+		ret = -ENOMEM;
+		goto err_alloc;
+	}
+
+	SET_NETDEV_DEV(ndev, &pdev->dev);
+	priv = netdev_priv(ndev);
+	platform_set_drvdata(pdev, ndev);
+	ether_setup(ndev);
+	ndev->netdev_ops = &xgmac_netdev_ops;
+	SET_ETHTOOL_OPS(ndev, &xgmac_ethtool_ops);
+
+	priv->device = &pdev->dev;
+	priv->dev = ndev;
+	priv->rx_pause = 1;
+	priv->tx_pause = 1;
+
+	priv->base = ioremap(res->start, resource_size(res));
+	if (!priv->base) {
+		netdev_err(ndev, "ioremap failed\n");
+		ret = -ENOMEM;
+		goto err_io;
+	}
+
+	uid = readl(priv->base + XGMAC_VERSION);
+	netdev_info(ndev, "h/w version is 0x%x\n", uid);
+
+	ndev->irq = platform_get_irq(pdev, 0);
+	if (ndev->irq == -ENXIO) {
+		netdev_err(ndev, "No irq resource\n");
+		ret = ndev->irq;
+		goto err_irq;
+	}
+
+	ret = request_irq(ndev->irq, xgmac_interrupt, 0,
+			  dev_name(&pdev->dev), ndev);
+	if (ret < 0) {
+		netdev_err(ndev, "Could not request irq %d - ret %d\n",
+			ndev->irq, ret);
+		goto err_irq;
+	}
+	disable_irq(ndev->irq);
+
+	priv->pmt_irq = platform_get_irq(pdev, 1);
+	if (priv->pmt_irq == -ENXIO) {
+		netdev_err(ndev, "No pmt irq resource\n");
+		ret = priv->pmt_irq;
+		goto err_pmt_irq;
+	}
+
+	ret = request_irq(priv->pmt_irq, xgmac_pmt_interrupt, 0,
+			  dev_name(&pdev->dev), ndev);
+	if (ret < 0) {
+		netdev_err(ndev, "Could not request irq %d - ret %d\n",
+			priv->pmt_irq, ret);
+		goto err_pmt_irq;
+	}
+
+	device_set_wakeup_capable(&pdev->dev, 1);
+	if (device_can_wakeup(priv->device))
+		priv->wolopts = WAKE_MAGIC;	/* Magic Frame as default */
+
+	ndev->hw_features = NETIF_F_SG | NETIF_F_FRAGLIST | NETIF_F_HIGHDMA;
+	if (readl(priv->base + XGMAC_DMA_HW_FEATURE) & DMA_HW_FEAT_TXCOESEL)
+		ndev->hw_features |= NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM |
+				     NETIF_F_RXCSUM;
+	ndev->features |= ndev->hw_features;
+	ndev->priv_flags |= IFF_UNICAST_FLT;
+
+	/* Get the MAC address */
+	xgmac_get_mac_addr(priv->base, ndev->dev_addr, 0);
+	if (!is_valid_ether_addr(ndev->dev_addr))
+		netdev_warn(ndev, "MAC address %pM not valid\n",
+			 ndev->dev_addr);
+
+	netif_napi_add(ndev, &priv->napi, xgmac_poll, 64);
+	ret = register_netdev(ndev);
+	if (ret)
+		goto err_reg;
+
+	return 0;
+
+err_reg:
+	free_irq(priv->pmt_irq, ndev);
+err_pmt_irq:
+	free_irq(ndev->irq, ndev);
+err_irq:
+	iounmap(priv->base);
+err_io:
+	free_netdev(ndev);
+err_alloc:
+	release_mem_region(res->start, resource_size(res));
+	platform_set_drvdata(pdev, NULL);
+	return ret;
+}
+
+/**
+ * xgmac_remove
+ * @pdev: platform device pointer
+ * Description: this function disables the MAC RX/TX, frees the IRQ lines,
+ * unregisters the net device, unmaps the register space and releases the
+ * memory region.
+ */
+static int xgmac_remove(struct platform_device *pdev)
+{
+	struct net_device *ndev = platform_get_drvdata(pdev);
+	struct xgmac_priv *priv = netdev_priv(ndev);
+	struct resource *res;
+
+	xgmac_mac_disable(priv->base);
+
+	/* Free the IRQ lines */
+	free_irq(ndev->irq, ndev);
+	free_irq(priv->pmt_irq, ndev);
+
+	platform_set_drvdata(pdev, NULL);
+	unregister_netdev(ndev);
+
+	iounmap(priv->base);
+	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	release_mem_region(res->start, resource_size(res));
+
+	free_netdev(ndev);
+
+	return 0;
+}
+
+#ifdef CONFIG_PM_SLEEP
+static void xgmac_pmt(void __iomem *ioaddr, unsigned long mode)
+{
+	unsigned int pmt = 0;
+
+	if (mode & WAKE_MAGIC)
+		pmt |= XGMAC_PMT_POWERDOWN | XGMAC_PMT_MAGIC_PKT;
+	if (mode & WAKE_UCAST)
+		pmt |= XGMAC_PMT_POWERDOWN | XGMAC_PMT_GLBL_UNICAST;
+
+	writel(pmt, ioaddr + XGMAC_PMT);
+}
+
+static int xgmac_suspend(struct device *dev)
+{
+	struct net_device *ndev = platform_get_drvdata(to_platform_device(dev));
+	struct xgmac_priv *priv = netdev_priv(ndev);
+	u32 value;
+
+	if (!ndev || !netif_running(ndev))
+		return 0;
+
+	netif_device_detach(ndev);
+	napi_disable(&priv->napi);
+	writel(0, priv->base + XGMAC_DMA_INTR_ENA);
+
+	if (device_may_wakeup(priv->device)) {
+		/* Stop TX/RX DMA Only */
+		value = readl(priv->base + XGMAC_DMA_CONTROL);
+		value &= ~(DMA_CONTROL_ST | DMA_CONTROL_SR);
+		writel(value, priv->base + XGMAC_DMA_CONTROL);
+
+		xgmac_pmt(priv->base, priv->wolopts);
+	} else
+		xgmac_mac_disable(priv->base);
+
+	return 0;
+}
+
+static int xgmac_resume(struct device *dev)
+{
+	struct net_device *ndev = platform_get_drvdata(to_platform_device(dev));
+	struct xgmac_priv *priv = netdev_priv(ndev);
+	void __iomem *ioaddr = priv->base;
+
+	if (!netif_running(ndev))
+		return 0;
+
+	xgmac_pmt(ioaddr, 0);
+
+	/* Enable the MAC and DMA */
+	xgmac_mac_enable(ioaddr);
+	writel(DMA_INTR_DEFAULT_MASK, ioaddr + XGMAC_DMA_STATUS);
+	writel(DMA_INTR_DEFAULT_MASK, ioaddr + XGMAC_DMA_INTR_ENA);
+
+	netif_device_attach(ndev);
+	napi_enable(&priv->napi);
+
+	return 0;
+}
+
+static SIMPLE_DEV_PM_OPS(xgmac_pm_ops, xgmac_suspend, xgmac_resume);
+#define XGMAC_PM_OPS (&xgmac_pm_ops)
+#else
+#define XGMAC_PM_OPS NULL
+#endif /* CONFIG_PM_SLEEP */
+
+static const struct of_device_id xgmac_of_match[] = {
+	{ .compatible = "calxeda,hb-xgmac", },
+	{},
+};
+MODULE_DEVICE_TABLE(of, xgmac_of_match);
+
+static struct platform_driver xgmac_driver = {
+	.driver = {
+		.name = "calxedaxgmac",
+		.of_match_table = xgmac_of_match,
+	},
+	.probe = xgmac_probe,
+	.remove = xgmac_remove,
+	.driver.pm = XGMAC_PM_OPS,
+};
+
+/**
+ * xgmac_init - Entry point for the driver
+ * Description: This function is the entry point for the driver.
+ */
+static int __init xgmac_init(void)
+{
+	return platform_driver_register(&xgmac_driver);
+}
+module_init(xgmac_init);
+
+/**
+ * xgmac_exit - Cleanup routine for the driver
+ * Description: This function is the cleanup routine for the driver.
+ */
+static void __exit xgmac_exit(void)
+{
+	platform_driver_unregister(&xgmac_driver);
+}
+module_exit(xgmac_exit);
+
+MODULE_AUTHOR("Calxeda, Inc.");
+MODULE_DESCRIPTION("Calxeda 10G XGMAC driver");
+MODULE_LICENSE("GPL v2");
-- 
1.7.5.4
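
The TX path above tracks the descriptor ring with tx_head/tx_tail and the
dma_ring_incr(), dma_ring_cnt() and dma_ring_space() helpers, whose
definitions are not part of this section. The following is a minimal
user-space sketch of that kind of index arithmetic, assuming a
power-of-two ring size and one slot kept free. The names ring_incr,
ring_cnt, ring_space and RING_SZ are hypothetical and only illustrate the
idea; they are not taken from the driver.

```c
#include <assert.h>

/* Hypothetical ring size; the masking below requires a power of two. */
#define RING_SZ 8u

/* Advance an index by one slot, wrapping at the end of the ring. */
static unsigned int ring_incr(unsigned int n, unsigned int size)
{
	assert(size != 0 && (size & (size - 1)) == 0);
	return (n + 1) & (size - 1);
}

/* Slots currently in use: distance from tail forward to head. */
static unsigned int ring_cnt(unsigned int head, unsigned int tail,
			     unsigned int size)
{
	return (head - tail) & (size - 1);
}

/*
 * Slots still free for the producer. One slot is always left unused,
 * so head == tail can only mean "empty", never "full".
 */
static unsigned int ring_space(unsigned int head, unsigned int tail,
			       unsigned int size)
{
	return (tail - head - 1) & (size - 1);
}
```

With this scheme a transmit routine would stop the queue when
ring_space() drops below the number of descriptors a frame needs
(nfrags + 1 in xgmac_xmit), and the completion path would walk from tail
towards head while ring_cnt() is non-zero.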

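xgmac_set_rx_mode() above falls back to a hash filter when there are more
addresses than perfect-filter slots: the top 9 bits of the inverted CRC
select one bit in a table of 32-bit registers. The sketch below shows
only that bit selection in user space. It takes the CRC as an input
instead of computing it, and NUM_HASH = 16 is an assumption (9 bits
address 512 filter bits, which is 16 registers of 32 bits);
hash_filter_set is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed number of 32-bit hash registers (16 * 32 = 512 filter bits). */
#define NUM_HASH 16

/*
 * Mark the filter bit selected by a 32-bit CRC of the address.
 * The upper 4 of the 9 index bits pick the register, the lower 5 bits
 * pick the bit inside that register.
 */
static void hash_filter_set(uint32_t filter[NUM_HASH], uint32_t crc)
{
	uint32_t bit_nr = (uint32_t)~crc >> 23;	/* 0..511 */

	assert(bit_nr < NUM_HASH * 32);
	filter[bit_nr >> 5] |= UINT32_C(1) << (bit_nr & 31);
}
```

Because several addresses can map to the same bit, a hash filter of this
kind can let unwanted frames through, so the stack still has to check
the destination address of what it receives.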
^ permalink raw reply related

* When can a net device get its setting correctly ?
From: WeipingPan @ 2011-10-31  2:53 UTC (permalink / raw)
  To: open list:NETWORKING [GENERAL]

Hi, all,

BUG DESCRIPTION:
Zheng Liang (lzheng@redhat.com) found that if we configure bonding with
the ARP monitor and enslave 10G cards, the bonding driver cannot get the
speed and duplex from them and assumes 100Mb/sec and Full.

I tested the upstream kernel at commit ec7ae517537a (Merge
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6) and
it has the same problem. It is not limited to 10G cards: with a 1Gb
card (igb) the problem is the same.

[root@dell-p390n-01 ~]# uname -a
Linux dell-p390n-01.lab.bos.redhat.com 3.1.0+ #1 SMP Fri Oct 28 23:38:59 EDT 2011 i686 i686 i386 GNU/Linux

[root@dell-p390n-01 ~]# dmesg |grep p4p1
udev: renamed network interface eth0 to p4p1
ADDRCONF(NETDEV_UP): p4p1: link is not ready
igb: p4p1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX
ADDRCONF(NETDEV_CHANGE): p4p1: link becomes ready

[root@dell-p390n-01 ~]# ethtool p4p1
Settings for p4p1:
         Supported ports: [ TP ]
         Supported link modes:   10baseT/Half 10baseT/Full
                                 100baseT/Half 100baseT/Full
                                 1000baseT/Full
         Supports auto-negotiation: Yes
         Advertised link modes:  10baseT/Half 10baseT/Full
                                 100baseT/Half 100baseT/Full
                                 1000baseT/Full
         Advertised pause frame use: No
         Advertised auto-negotiation: Yes
         Speed: 1000Mb/s
         Duplex: Full
         Port: Twisted Pair
         PHYAD: 1
         Transceiver: internal
         Auto-negotiation: on
         MDI-X: Unknown
         Supports Wake-on: pumbg
         Wake-on: d
         Current message level: 0x00000003 (3)
         Link detected: yes

[root@dell-p390n-01 ~]# modprobe bonding mode=1 arp_interval=100 arp_ip_target=10.66.12.130
[root@dell-p390n-01 ~]# ifconfig bond0 up
[root@dell-p390n-01 ~]# ifenslave bond0 p4p1

[root@dell-p390n-01 ~]# dmesg
bonding: Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)
bonding: ARP monitoring set to 100 ms, validate none, with 1 target(s):
bonding:  10.66.12.130
bonding:
ADDRCONF(NETDEV_UP): bond0: link is not ready
bonding: bond0: Warning: failed to get speed and duplex from p4p1, assumed to be 100Mb/sec and Full. <------ bug
bonding: bond0: making interface p4p1 the new active one.
bonding: bond0: first active interface up!
bonding: bond0: enslaving p4p1 as an active interface with an up link.
ADDRCONF(NETDEV_CHANGE): bond0: link becomes ready
bonding: bond0: link status definitely down for interface p4p1, disabling it
bonding: bond0: now running without any active interface !
igb: p4p1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX


[root@dell-p390n-01 ~]# cat /proc/net/bonding/bond0
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: fault-tolerance (active-backup)
Primary Slave: None
Currently Active Slave: None
MII Status: down
MII Polling Interval (ms): 0
Up Delay (ms): 0
Down Delay (ms): 0
ARP Polling Interval (ms): 100
ARP IP target/s (n.n.n.n form): 10.66.12.130

Slave Interface: p4p1
MII Status: down
Speed: 100 Mbps <------ bug
Duplex: full
Link Failure Count: 1
Permanent HW addr: 00:1b:21:66:d8:a0
Slave queue ID: 0


But there is no such problem when miimon is used.

[root@dell-p390n-01 ~]# modprobe bonding mode=1 miimon=100
[root@dell-p390n-01 ~]# ifconfig bond0 up
[root@dell-p390n-01 ~]# ifenslave bond0 p4p1
[root@dell-p390n-01 ~]# dmesg
bonding: Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)
bonding: MII link monitoring set to 100 ms
ADDRCONF(NETDEV_UP): bond0: link is not ready
bonding: bond0: enslaving p4p1 as a backup interface with a down link.
igb: p4p1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX
bonding: bond0: link status definitely up for interface p4p1, 1000 Mbps full duplex.
bonding: bond0: making interface p4p1 the new active one.
bonding: bond0: first active interface up!
ADDRCONF(NETDEV_CHANGE): bond0: link becomes ready


BUG ANALYSIS:
First, with the ARP monitor, the call path is:
1485 int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev)
1652         res = dev_open(slave_dev);
1761         if (bond_update_speed_duplex(new_slave) &&

When bond_update_speed_duplex() is called, the message "igb: p4p1 NIC Link
is Up 1000 Mbps Full Duplex, Flow Control: RX" has not shown up yet.
So I think that even though we have called dev_open(), the device is not
yet ready to report its settings.

Second, with miimon, the call path is:
1485 int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev)
1652         res = dev_open(slave_dev);

2419 static void bond_miimon_commit(struct bonding *bond)
2444                         bond_update_speed_duplex(slave);

Here, when bond_update_speed_duplex() is called, it gets the correct settings.

QUESTION:
When can a net device report its settings correctly?
Maybe dev_open() is not enough.
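
To make the timing issue concrete, here is a minimal userspace sketch, not
kernel code. All model_* names are hypothetical. It assumes a driver that
reports an unknown speed until its link is up, and a caller that falls back
to 100Mb/s Full in that case, as the dmesg warning above suggests.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
#define MODEL_SPEED_UNKNOWN  UINT32_MAX  /* stands in for a speed of -1 */
#define MODEL_DUPLEX_UNKNOWN 0xff
#define MODEL_DUPLEX_FULL    1

struct model_nic {
	bool opened;     /* the open call has returned */
	bool link_up;    /* the "NIC Link is Up" message has been logged */
	uint32_t speed;  /* negotiated speed in Mb/s */
};

struct model_slave {
	uint32_t speed;
	uint8_t duplex;
};

/* Models a driver that only knows its settings once the link is up. */
static void model_get_settings(const struct model_nic *nic,
			       uint32_t *speed, uint8_t *duplex)
{
	assert(nic->opened);  /* settings are only queried on an opened device */
	if (nic->link_up) {
		*speed = nic->speed;
		*duplex = MODEL_DUPLEX_FULL;
	} else {
		*speed = MODEL_SPEED_UNKNOWN;
		*duplex = MODEL_DUPLEX_UNKNOWN;
	}
}

/* Models the fallback: keep 100/Full and return -1 if the speed is unknown. */
static int model_update_speed_duplex(struct model_slave *slave,
				     const struct model_nic *nic)
{
	uint32_t speed;
	uint8_t duplex;

	slave->speed = 100;
	slave->duplex = MODEL_DUPLEX_FULL;

	model_get_settings(nic, &speed, &duplex);
	if (speed == 0 || speed == MODEL_SPEED_UNKNOWN)
		return -1;

	slave->speed = speed;
	slave->duplex = duplex;
	return 0;
}
```

Calling model_update_speed_duplex() on an opened NIC whose link_up is still
false leaves the slave at 100/Full; calling it again after link_up becomes
true picks up the real speed. That matches the difference between the ARP
monitor and miimon paths, if the assumption about the driver holds.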

thanks
Weiping Pan


* Re: [PATCH v2] net: add calxeda xgmac ethernet driver
From: saeed bishara @ 2011-10-31  7:58 UTC (permalink / raw)
  To: Rob Herring, netdev, devicetree-discuss; +Cc: joe
In-Reply-To: <1320010217-17322-1-git-send-email-robherring2@gmail.com>

On Sun, Oct 30, 2011 at 11:30 PM, Rob Herring <robherring2@gmail.com> wrote:
> From: Rob Herring <rob.herring@calxeda.com>
Hi Rob, one more comment
>


> +static void xgmac_tx_complete(struct xgmac_priv *priv)
> +{
> +       void __iomem *ioaddr = priv->base;
> +
> +
> +       writel(DMA_STATUS_TU | DMA_STATUS_NIS, ioaddr + XGMAC_DMA_STATUS);
> +
> +       while (dma_ring_cnt(priv->tx_head, priv->tx_tail, DMA_TX_RING_SZ)) {
> +               unsigned int entry = priv->tx_tail;
> +               struct sk_buff *skb = priv->tx_skbuff[entry];
> +               struct xgmac_dma_desc *p = priv->dma_tx + entry;
> +
> +               /* Check if the descriptor is owned by the DMA. */
> +               if (desc_get_owner(p))
> +                       break;
> +
> +               /* Verify tx error by looking at the last segment */
> +               if (desc_get_tx_ls(p))
> +                       desc_get_tx_status(priv, p);
> +
> +               netdev_dbg(priv->dev, "tx ring: curr %d, dirty %d\n",
> +                       priv->tx_head, priv->tx_tail);
> +
> +               dma_unmap_single(priv->device, desc_get_buf_addr(p),
> +                                desc_get_buf_len(p), DMA_TO_DEVICE);
I think you forgot to call dma_unmap_page() for the skb frag pages.
saeed


* [GIT] Networking
From: David Miller @ 2011-10-31  8:40 UTC (permalink / raw)
  To: torvalds; +Cc: akpm, netdev, linux-kernel


The majority of the bits here are driver bug fixes, but more notably:

1) icmp6_dst_alloc needs to set the destination address in the route
   before trying to bind the route to an inetpeer entry, since the
   inetpeer is found by destination.  Fix from Gao Feng.

2) Traffic class not set properly for TIME_WAIT sockets, from Eric
   Dumazet.

3) Fix vlan over bonding ARP regression, also from Eric Dumazet.

4) ip6_ufo_append_data() does not propagate errors properly, resulting
   in signal interrupts and hangups looking like memory allocation
   errors.  Fix from Zheng Yan.

5) Refcounting and hash lookup fixes in batman-adv from Simon Wunderlich.

7) Fix races in bond_close() and workqueue deadlocks.  From Jay
   Vosburgh.

8) IPV6 addrconf prefix handling needs to explicitly lookup routes
   in the RT6_TABLE_PREFIX routing table, otherwise it might find
   unrelated routes.  Fix from Andreas Hofmeister.

Please pull, thanks a lot!

The following changes since commit 839d8810747bbf39e0a5a7f223b67bffa7945f8d:

  Merge branch 'i2c-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging (2011-10-30 15:54:59 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git master

Andreas Hofmeister (1):
      ipv6: fix route lookup in addrconf_prefix_rcv()

Angus Clark (1):
      stmmac: fix NULL pointer dereference in capabilities fixup (v2)

Antonio Quartulli (1):
      batman-adv: unify hash_entry field position in tt_local/global_entry

David S. Miller (1):
      Merge branch 'batman-adv/maint' of git://git.open-mesh.org/linux-merge

Dmitry Kravkov (2):
      bnx2x: use FW 7.0.29.0
      bnx2x: update driver version to 1.70.30-0

Eric Dumazet (2):
      ipv6: tcp: fix TCLASS value in ACK messages sent from TIME_WAIT
      vlan: allow nested vlan_do_receive()

Gao feng (1):
      ipv6: fix route error binding peer in func icmp6_dst_alloc

Geert Uytterhoeven (1):
      i825xx: Fix incorrect dependency for BVME6000_NET

Giuseppe CAVALLARO (2):
      stmmac: fix a bug while checking the HW cap reg (v2)
      stmmac: update normal descriptor structure (v2)

Jay Vosburgh (1):
      bonding: eliminate bond_close race conditions

Simon Wunderlich (2):
      batman-adv: remove references for global tt entries
      batman-adv: add sanity check when removing global tts

Somnath Kotur (2):
      be2net: Refactored be_cmds.c file.
      be2net: Changing MAC Address of a VF was broken.

Sony Chacko (1):
      qlcnic: updated reset sequence

Sritej Velaga (2):
      qlcnic: skip IDC ack check in fw reset path.
      qlcnic: Updated License file

Sucheta Chakraborty (2):
      qlcnic: reset loopback mode if promiscous mode setting fails.
      qlcnic: fix beacon and LED test.

Yaniv Rosner (5):
      bnx2x: Fix LED blink rate for 578xx
      bnx2x: Add link retry to 578xx-KR
      bnx2x: Fix RX/TX problem caused by the MAC layer
      bnx2x: Fix 54618se LED behavior
      bnx2x: Enable changing speed when port type is PORT_DA

Zheng Yan (1):
      ipv6: fix error propagation in ip6_ufo_append_data()

 Documentation/networking/LICENSE.qlcnic            |   51 +---
 drivers/net/bonding/bond_3ad.c                     |    8 +-
 drivers/net/bonding/bond_alb.c                     |   16 +-
 drivers/net/bonding/bond_main.c                    |   96 +++---
 drivers/net/bonding/bonding.h                      |    1 -
 drivers/net/ethernet/broadcom/bnx2x/bnx2x.h        |    4 +-
 .../net/ethernet/broadcom/bnx2x/bnx2x_ethtool.c    |    1 +
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_hsi.h    |    2 +-
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_link.c   |  217 ++++++++---
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_link.h   |    3 +
 drivers/net/ethernet/emulex/benet/be_cmds.c        |  400 ++++++--------------
 drivers/net/ethernet/emulex/benet/be_main.c        |   28 +-
 drivers/net/ethernet/i825xx/Kconfig                |    2 +-
 drivers/net/ethernet/qlogic/qlcnic/qlcnic.h        |    4 +-
 .../net/ethernet/qlogic/qlcnic/qlcnic_ethtool.c    |   45 ++-
 drivers/net/ethernet/qlogic/qlcnic/qlcnic_hdr.h    |    2 +
 drivers/net/ethernet/qlogic/qlcnic/qlcnic_hw.c     |    2 +-
 drivers/net/ethernet/qlogic/qlcnic/qlcnic_init.c   |   50 +++-
 drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c   |   48 ++-
 drivers/net/ethernet/stmicro/stmmac/common.h       |    8 +-
 drivers/net/ethernet/stmicro/stmmac/descs.h        |   31 +-
 drivers/net/ethernet/stmicro/stmmac/norm_desc.c    |   38 +-
 .../net/ethernet/stmicro/stmmac/stmmac_ethtool.c   |    8 +-
 drivers/net/ethernet/stmicro/stmmac/stmmac_main.c  |   13 +-
 include/linux/if_vlan.h                            |    6 +-
 include/net/inet_timewait_sock.h                   |    1 +
 include/net/ipv6.h                                 |    3 +-
 net/8021q/vlan_core.c                              |    7 +-
 net/batman-adv/translation-table.c                 |   17 +-
 net/batman-adv/types.h                             |    4 +-
 net/core/dev.c                                     |    4 +-
 net/dccp/ipv6.c                                    |    4 +-
 net/ipv4/tcp_minisocks.c                           |    1 +
 net/ipv6/addrconf.c                                |   43 ++-
 net/ipv6/inet6_connection_sock.c                   |    2 +-
 net/ipv6/ip6_output.c                              |    9 +-
 net/ipv6/route.c                                   |    3 +-
 net/ipv6/tcp_ipv6.c                                |   17 +-
 net/sctp/ipv6.c                                    |    2 +-
 39 files changed, 634 insertions(+), 567 deletions(-)


* [PATCH 2/2 v4] net/smsc911x: Add regulator support
From: Robert Marklund @ 2011-10-31 12:38 UTC (permalink / raw)
  To: netdev, Steve Glendinning
  Cc: Mathieu Poirier, Robert Marklund, Paul Mundt, linux-sh,
	Sascha Hauer, Tony Lindgren, linux-omap, Mike Frysinger,
	uclinux-dist-devel, Linus Walleij

Add some basic regulator support for the power pins, as needed
by the ST-Ericsson Snowball platform, which powers up the SMSC911x
chip using an external regulator.

Platforms that use both regulators and the smsc911x, define no
regulator for the smsc911x, and claim complete regulator constraints
with no dummy regulators will need to provide one, for example using
a fixed voltage regulator. Apart from the Ux500 Snowball, this may
affect the following archs/machines, which from some greps appear
to define both CONFIG_SMSC911X and CONFIG_REGULATOR:

- ARM Freescale mx3 and OMAP 2 plus, Raumfeld machines
- Blackfin
- Super-H

Cc: Paul Mundt <lethal@linux-sh.org>
Cc: linux-sh@vger.kernel.org
Cc: Sascha Hauer <s.hauer@pengutronix.de>
Cc: Tony Lindgren <tony@atomide.com>
Cc: linux-omap@vger.kernel.org
Cc: Mike Frysinger <vapier@gentoo.org>
Cc: uclinux-dist-devel@blackfin.uclinux.org
Reviewed-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Robert Marklund <robert.marklund@stericsson.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
---
ChangeLog v3->v4:
- Remove dual prints and old comment on Mike's request.
- Split the request_free function on Mike's and Sascha's request.
ChangeLog v2->v3:
- Use bulk regulators on Mark's request.
- Add Cc fields for some possibly affected platforms.
ChangeLog v1->v2:
- Don't check for NULL regulators and error out properly if the
  regulators can't be found. All platforms using the smsc911x
  and the regulator framework simultaneously need to provide some
  kind of regulator for it.
---
 drivers/net/ethernet/smsc/smsc911x.c |  103 ++++++++++++++++++++++++++++++----
 1 files changed, 92 insertions(+), 11 deletions(-)

diff --git a/drivers/net/ethernet/smsc/smsc911x.c b/drivers/net/ethernet/smsc/smsc911x.c
index 8843071..9a2e792 100644
--- a/drivers/net/ethernet/smsc/smsc911x.c
+++ b/drivers/net/ethernet/smsc/smsc911x.c
@@ -44,6 +44,7 @@
 #include <linux/module.h>
 #include <linux/netdevice.h>
 #include <linux/platform_device.h>
+#include <linux/regulator/consumer.h>
 #include <linux/sched.h>
 #include <linux/timer.h>
 #include <linux/bug.h>
@@ -88,6 +89,8 @@ struct smsc911x_ops {
 				unsigned int *buf, unsigned int wordcount);
 };
 
+#define SMSC911X_NUM_SUPPLIES 2
+
 struct smsc911x_data {
 	void __iomem *ioaddr;
 
@@ -138,6 +141,9 @@ struct smsc911x_data {
 
 	/* register access functions */
 	const struct smsc911x_ops *ops;
+
+	/* regulators */
+	struct regulator_bulk_data supplies[SMSC911X_NUM_SUPPLIES];
 };
 
 /* Easy access to information */
@@ -362,6 +368,68 @@ out:
 	spin_unlock_irqrestore(&pdata->dev_lock, flags);
 }
 
+/*
+ * Enable or disable resources, currently just regulators.
+ */
+static int smsc911x_enable_disable_resources(struct platform_device *pdev,
+					     bool enable)
+{
+	struct net_device *ndev = platform_get_drvdata(pdev);
+	struct smsc911x_data *pdata = netdev_priv(ndev);
+	int ret = 0;
+
+	/* enable/disable regulators */
+	if (enable) {
+		ret = regulator_bulk_enable(ARRAY_SIZE(pdata->supplies),
+				pdata->supplies);
+		if (ret)
+			netdev_err(ndev, "failed to enable regulators %d\n",
+					ret);
+	} else
+		ret = regulator_bulk_disable(ARRAY_SIZE(pdata->supplies),
+				pdata->supplies);
+	return ret;
+}
+
+/*
+ * Request resources, currently just regulators.
+ *
+ * The SMSC911x has two power pins: vddvario and vdd33a, in designs where
+ * these are not always-on we need to request regulators to be turned on
+ * before we can try to access the device registers.
+ */
+static int smsc911x_request_resources(struct platform_device *pdev)
+{
+	struct net_device *ndev = platform_get_drvdata(pdev);
+	struct smsc911x_data *pdata = netdev_priv(ndev);
+	int ret = 0;
+
+	/* Request regulators */
+	pdata->supplies[0].supply = "vdd33a";
+	pdata->supplies[1].supply = "vddvario";
+	ret = regulator_bulk_get(&pdev->dev,
+			ARRAY_SIZE(pdata->supplies),
+			pdata->supplies);
+	if (ret)
+		netdev_err(ndev, "couldn't get regulators %d\n",
+				ret);
+	return ret;
+}
+
+/*
+ * Free resources, currently just regulators.
+ *
+ */
+static void smsc911x_free_resources(struct platform_device *pdev)
+{
+	struct net_device *ndev = platform_get_drvdata(pdev);
+	struct smsc911x_data *pdata = netdev_priv(ndev);
+
+	/* Free regulators */
+	regulator_bulk_free(ARRAY_SIZE(pdata->supplies),
+			pdata->supplies);
+}
+
 /* waits for MAC not busy, with timeout.  Only called by smsc911x_mac_read
  * and smsc911x_mac_write, so assumes mac_lock is held */
 static int smsc911x_mac_complete(struct smsc911x_data *pdata)
@@ -2092,6 +2160,9 @@ static int __devexit smsc911x_drv_remove(struct platform_device *pdev)
 
 	iounmap(pdata->ioaddr);
 
+	(void)smsc911x_enable_disable_resources(pdev, false);
+	smsc911x_free_resources(pdev);
+
 	free_netdev(dev);
 
 	return 0;
@@ -2218,10 +2289,20 @@ static int __devinit smsc911x_drv_probe(struct platform_device *pdev)
 	pdata->dev = dev;
 	pdata->msg_enable = ((1 << debug) - 1);
 
+	platform_set_drvdata(pdev, dev);
+
+	retval = smsc911x_request_resources(pdev);
+	if (retval)
+		goto out_return_resources;
+
+	retval = smsc911x_enable_disable_resources(pdev, true);
+	if (retval)
+		goto out_disable_resources;
+
 	if (pdata->ioaddr == NULL) {
 		SMSC_WARN(pdata, probe, "Error smsc911x base address invalid");
 		retval = -ENOMEM;
-		goto out_free_netdev_2;
+		goto out_disable_resources;
 	}
 
 	retval = smsc911x_probe_config_dt(&pdata->config, np);
@@ -2233,7 +2314,7 @@ static int __devinit smsc911x_drv_probe(struct platform_device *pdev)
 
 	if (retval) {
 		SMSC_WARN(pdata, probe, "Error smsc911x config not found");
-		goto out_unmap_io_3;
+		goto out_disable_resources;
 	}
 
 	/* assume standard, non-shifted, access to HW registers */
@@ -2244,7 +2325,7 @@ static int __devinit smsc911x_drv_probe(struct platform_device *pdev)
 
 	retval = smsc911x_init(dev);
 	if (retval < 0)
-		goto out_unmap_io_3;
+		goto out_disable_resources;
 
 	/* configure irq polarity and type before connecting isr */
 	if (pdata->config.irq_polarity == SMSC911X_IRQ_POLARITY_ACTIVE_HIGH)
@@ -2264,15 +2345,13 @@ static int __devinit smsc911x_drv_probe(struct platform_device *pdev)
 	if (retval) {
 		SMSC_WARN(pdata, probe,
 			  "Unable to claim requested irq: %d", dev->irq);
-		goto out_unmap_io_3;
+		goto out_free_irq;
 	}
 
-	platform_set_drvdata(pdev, dev);
-
 	retval = register_netdev(dev);
 	if (retval) {
 		SMSC_WARN(pdata, probe, "Error %i registering device", retval);
-		goto out_unset_drvdata_4;
+		goto out_free_irq;
 	} else {
 		SMSC_TRACE(pdata, probe,
 			   "Network interface: \"%s\"", dev->name);
@@ -2321,12 +2400,14 @@ static int __devinit smsc911x_drv_probe(struct platform_device *pdev)
 
 out_unregister_netdev_5:
 	unregister_netdev(dev);
-out_unset_drvdata_4:
-	platform_set_drvdata(pdev, NULL);
+out_free_irq:
 	free_irq(dev->irq, dev);
-out_unmap_io_3:
+out_disable_resources:
+	(void)smsc911x_enable_disable_resources(pdev, false);
+out_return_resources:
+	smsc911x_free_resources(pdev);
+	platform_set_drvdata(pdev, NULL);
 	iounmap(pdata->ioaddr);
-out_free_netdev_2:
 	free_netdev(dev);
 out_release_io_1:
 	release_mem_region(res->start, resource_size(res));
-- 
1.7.1



* [PATCH] SUNRPC: remove non-exclusive pipe creation from RPC pipefs
From: Stanislav Kinsbursky @ 2011-10-31 13:07 UTC (permalink / raw)
  To: bfields, Trond.Myklebust
  Cc: linux-nfs, xemul, neilb, netdev, linux-kernel, davem, devel

This patch was created against a clone of the git branch:
git://git.linux-nfs.org/projects/trondmy/nfs-2.6.git
and rebased on tag "v3.1".

The SUNRPC pipefs non-exclusive pipe creation code looks obsolete. IOW, as
I see it, all pipes are created with a unique full path, and only once.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>

---
 include/linux/sunrpc/rpc_pipe_fs.h |    1 -
 net/sunrpc/rpc_pipe.c              |   44 +++++-------------------------------
 2 files changed, 6 insertions(+), 39 deletions(-)

diff --git a/include/linux/sunrpc/rpc_pipe_fs.h b/include/linux/sunrpc/rpc_pipe_fs.h
index 271e1b2..80606ab 100644
--- a/include/linux/sunrpc/rpc_pipe_fs.h
+++ b/include/linux/sunrpc/rpc_pipe_fs.h
@@ -30,7 +30,6 @@ struct rpc_inode {
 	int pipelen;
 	int nreaders;
 	int nwriters;
-	int nkern_readwriters;
 	wait_queue_head_t waitq;
 #define RPC_PIPE_WAIT_FOR_OPEN	1
 	int flags;
diff --git a/net/sunrpc/rpc_pipe.c b/net/sunrpc/rpc_pipe.c
index 42e4b6e..a1f23c4 100644
--- a/net/sunrpc/rpc_pipe.c
+++ b/net/sunrpc/rpc_pipe.c
@@ -558,7 +558,6 @@ static int __rpc_mkpipe(struct inode *dir, struct dentry *dentry,
 	if (err)
 		return err;
 	rpci = RPC_I(dentry->d_inode);
-	rpci->nkern_readwriters = 1;
 	rpci->private = private;
 	rpci->flags = flags;
 	rpci->ops = ops;
@@ -591,16 +590,12 @@ static int __rpc_unlink(struct inode *dir, struct dentry *dentry)
 static int __rpc_rmpipe(struct inode *dir, struct dentry *dentry)
 {
 	struct inode *inode = dentry->d_inode;
-	struct rpc_inode *rpci = RPC_I(inode);
 
-	rpci->nkern_readwriters--;
-	if (rpci->nkern_readwriters != 0)
-		return 0;
 	rpc_close_pipes(inode);
 	return __rpc_unlink(dir, dentry);
 }
 
-static struct dentry *__rpc_lookup_create(struct dentry *parent,
+static struct dentry *__rpc_lookup_create_exclusive(struct dentry *parent,
 					  struct qstr *name)
 {
 	struct dentry *dentry;
@@ -608,27 +603,13 @@ static struct dentry *__rpc_lookup_create(struct dentry *parent,
 	dentry = d_lookup(parent, name);
 	if (!dentry) {
 		dentry = d_alloc(parent, name);
-		if (!dentry) {
-			dentry = ERR_PTR(-ENOMEM);
-			goto out_err;
-		}
+		if (!dentry)
+			return ERR_PTR(-ENOMEM);
 	}
-	if (!dentry->d_inode)
+	if (dentry->d_inode == NULL) {
 		d_set_d_op(dentry, &rpc_dentry_operations);
-out_err:
-	return dentry;
-}
-
-static struct dentry *__rpc_lookup_create_exclusive(struct dentry *parent,
-					  struct qstr *name)
-{
-	struct dentry *dentry;
-
-	dentry = __rpc_lookup_create(parent, name);
-	if (IS_ERR(dentry))
-		return dentry;
-	if (dentry->d_inode == NULL)
 		return dentry;
+	}
 	dput(dentry);
 	return ERR_PTR(-EEXIST);
 }
@@ -816,22 +797,9 @@ struct dentry *rpc_mkpipe(struct dentry *parent, const char *name,
 	q.hash = full_name_hash(q.name, q.len),
 
 	mutex_lock_nested(&dir->i_mutex, I_MUTEX_PARENT);
-	dentry = __rpc_lookup_create(parent, &q);
+	dentry = __rpc_lookup_create_exclusive(parent, &q);
 	if (IS_ERR(dentry))
 		goto out;
-	if (dentry->d_inode) {
-		struct rpc_inode *rpci = RPC_I(dentry->d_inode);
-		if (rpci->private != private ||
-				rpci->ops != ops ||
-				rpci->flags != flags) {
-			dput (dentry);
-			err = -EBUSY;
-			goto out_err;
-		}
-		rpci->nkern_readwriters++;
-		goto out;
-	}
-
 	err = __rpc_mkpipe(dir, dentry, umode, &rpc_pipe_fops,
 			   private, ops, flags);
 	if (err)
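
As an illustration of the semantics this patch moves to, here is a small
userspace sketch, not the rpc_pipefs code. The model_* names and the fixed
size table are hypothetical. It assumes that creating a pipe whose name
already exists fails with -EEXIST, and that removal no longer consults a
reference count.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

#define MODEL_MAX_PIPES 8

/* Hypothetical stand-in for a pipefs directory. */
struct model_dir {
	const char *names[MODEL_MAX_PIPES];
	int count;
};

/* Exclusive create: 0 on success, -EEXIST if present, -ENOMEM if full. */
static int model_mkpipe_exclusive(struct model_dir *dir, const char *name)
{
	assert(dir && name);

	for (int i = 0; i < dir->count; i++)
		if (strcmp(dir->names[i], name) == 0)
			return -EEXIST;

	if (dir->count == MODEL_MAX_PIPES)
		return -ENOMEM;

	dir->names[dir->count++] = name;
	return 0;
}

/* Removal always unlinks; there is no user count to decrement first. */
static int model_rmpipe(struct model_dir *dir, const char *name)
{
	assert(dir && name);

	for (int i = 0; i < dir->count; i++) {
		if (strcmp(dir->names[i], name) == 0) {
			/* Fill the hole with the last entry. */
			dir->names[i] = dir->names[--dir->count];
			return 0;
		}
	}
	return -ENOENT;
}
```

With this model, a second create of the same name is rejected instead of
bumping a counter, and a single remove is always enough to make the name
available again.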


* dev->promiscuity can become negative in specific bridge + vlan configuration
From: Matthijs Kooijman @ 2011-10-31 13:44 UTC (permalink / raw)
  To: netdev

(Please CC me, I'm not subscribed)

Hi folks,

while debugging an issue on a wireless access point running Linux, I
encountered an issue with the promiscuity of interfaces when combining
vlans and bridges. When configuring them in a specific order, the
dev->promiscuity can become negative and the IFF_PROMISC flag will be
wrong.

I've reproduced this on an old 2.6.26 kernel as well as a
3.1.0-rc10-ish kernel from misc.git.

When a VLAN device is configured to be promiscuous, the underlying
physical device must be promiscuous as well. This is achieved in two
ways:

1. When a vlan device is brought up and its IFF_PROMISC flag is set, the
   promiscuity of the underlying interface is increased. When a vlan
   device is brought down and its IFF_PROMISC flag is set, the
   promiscuity of the underlying interface is decreased. This happens in
   vlan_dev_open() and vlan_dev_stop().
2. When the IFF_PROMISC flag changes on the vlan device, the promiscuity
   value of the underlying device is increased or decreased, depending
   on the value of the flag. This happens in vlan_dev_change_rx_flags().

However, 2. also happens when the interface is not up, which is
incorrect AFAICS. If a promiscuous VLAN interface is first brought down
and then has its promiscuous flag reset, the promiscuity of the
underlying physical interface will be decreased twice.

This problem can be demonstrated with the following commands. It should
happen on any hardware and all recent (and not-so-recent) kernels,
AFAICS. I've added the relevant kernel output and some comments inline
below.

$ vconfig add eth0 10
Added VLAN with VID == 10 to IF -:eth0:-
$ brctl addbr br0
$ ifconfig eth0.10 up
$ brctl addif br0 eth0.10
[14288.817391] device eth0.10 entered promiscuous mode
[14288.817397] device eth0 entered promiscuous mode
# eth0->promiscuity is now 1, as expected

$ ifconfig eth0.10 down
[14335.533019] device eth0 left promiscuous mode
# eth0->promiscuity is now 0, as expected

$ brctl delif br0 eth0.10
[14351.037666] device eth0.10 left promiscuous mode
# eth0->promiscuity is now -1!

$ brctl addif br0 eth0.10
[14383.549998] device eth0.10 entered promiscuous mode
# eth0->promiscuity is now 0, so eth0 is not entering promiscuous mode
# as it should

I've confirmed that the promiscuity actually gets set to -1 by some
added kernel prints on the 2.6.26 kernel, but the above behaviour is
also shown on the 3.1.0-rc10 kernel (which is consistent with
promiscuity diving below 0).

From looking at the code, I assume the same story applies for the
IFF_ALLMULTI flag, but I've not tested this.


I'm working on a (simple) patch to fix this issue, by simply not
updating the promiscuity of the underlying interface if the vlan
interface is down. I'll reply to this message with the patch after I've
finished and tested it.
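
The double decrement and the proposed fix can be modelled in a few lines of
userspace C. This is a sketch of the counting logic only, not the vlan code:
every model_* name is hypothetical, and the check_up flag stands in for the
proposed "skip the update while the vlan interface is down" behaviour.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the underlying device, e.g. eth0. */
struct model_dev {
	int promiscuity;
};

/* Hypothetical model of a VLAN device stacked on it, e.g. eth0.10. */
struct model_vlan {
	struct model_dev *real;
	bool up;
	bool promisc;  /* models IFF_PROMISC on the VLAN device */
};

/* Path 1: open/stop propagate a flag that is already set. */
static void model_vlan_open(struct model_vlan *v)
{
	v->up = true;
	if (v->promisc)
		v->real->promiscuity++;
}

static void model_vlan_stop(struct model_vlan *v)
{
	v->up = false;
	if (v->promisc)
		v->real->promiscuity--;
}

/*
 * Path 2: the flag itself changes. With check_up set, the underlying
 * device is left alone while the VLAN device is down.
 */
static void model_vlan_set_promisc(struct model_vlan *v, bool on, bool check_up)
{
	if (v->promisc == on)
		return;
	v->promisc = on;
	if (check_up && !v->up)
		return;
	v->real->promiscuity += on ? 1 : -1;
}

/* Replays: up, brctl addif, down, brctl delif. Returns eth0's final count. */
static int model_replay(bool check_up)
{
	struct model_dev eth0 = { 0 };
	struct model_vlan v = { .real = &eth0, .up = false, .promisc = false };

	model_vlan_open(&v);                          /* ifconfig eth0.10 up */
	model_vlan_set_promisc(&v, true, check_up);   /* brctl addif */
	assert(eth0.promiscuity == 1);
	model_vlan_stop(&v);                          /* ifconfig eth0.10 down */
	model_vlan_set_promisc(&v, false, check_up);  /* brctl delif */
	return eth0.promiscuity;
}
```

model_replay(false) ends at -1, matching the trace above, while
model_replay(true) ends at 0. In this model, setting the flag while the
device is down is still accounted for by the next model_vlan_open().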

Gr.

Matthijs


* [PATCH] bonding:update speed/duplex for NETDEV_CHANGE
From: Weiping Pan @ 2011-10-31 14:19 UTC (permalink / raw)
  To: netdev; +Cc: fubar, andy, linux-kernel, Weiping Pan
In-Reply-To: <4EAE0D9A.9060408@gmail.com>

Zheng Liang (lzheng@redhat.com) found a bug: if we configure bonding with
the ARP monitor, the bonding driver sometimes cannot get the speed and
duplex from its slaves and assumes them to be 100Mb/sec and Full; see
/proc/net/bonding/bond0.
There is no such problem when miimon is used.

(Taking igb as an example)
I found that the reason is that after dev_open() in bond_enslave(),
bond_update_speed_duplex() calls igb_get_settings(), but that function
runs ethtool_cmd_speed_set(ecmd, -1); ecmd->duplex = -1;
because igb gets an error value for status.
So even though dev_open() has been called, the device is not really ready
to report its settings.

It may only be safe to call igb_get_settings() after the message
"igb: p4p1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX"
shows up.

So I prefer to update the speed and duplex for a slave when it receives
a NETDEV_CHANGE/NETDEV_UP event.

Signed-off-by: Weiping Pan <wpan@redhat.com>
---
 drivers/net/bonding/bond_main.c |   19 ++++++++-----------
 1 files changed, 8 insertions(+), 11 deletions(-)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index c34cc1e..f5458eb 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -3220,6 +3220,7 @@ static int bond_slave_netdev_event(unsigned long event,
 {
 	struct net_device *bond_dev = slave_dev->master;
 	struct bonding *bond = netdev_priv(bond_dev);
+	struct slave *slave = NULL;
 
 	switch (event) {
 	case NETDEV_UNREGISTER:
@@ -3230,20 +3231,16 @@ static int bond_slave_netdev_event(unsigned long event,
 				bond_release(bond_dev, slave_dev);
 		}
 		break;
+	case NETDEV_UP:
 	case NETDEV_CHANGE:
-		if (bond->params.mode == BOND_MODE_8023AD || bond_is_lb(bond)) {
-			struct slave *slave;
-
-			slave = bond_get_slave_by_dev(bond, slave_dev);
-			if (slave) {
-				u32 old_speed = slave->speed;
-				u8  old_duplex = slave->duplex;
-
-				bond_update_speed_duplex(slave);
+		slave = bond_get_slave_by_dev(bond, slave_dev);
+		if (slave) {
+			u32 old_speed = slave->speed;
+			u8  old_duplex = slave->duplex;
 
-				if (bond_is_lb(bond))
-					break;
+			bond_update_speed_duplex(slave);
 
+			if (bond->params.mode == BOND_MODE_8023AD) {
 				if (old_speed != slave->speed)
 					bond_3ad_adapter_speed_changed(slave);
 				if (old_duplex != slave->duplex)
-- 
1.7.4
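
The change in event handling can be summarised with a small userspace
sketch. It is a model under assumptions, not the bonding code: all model_*
names are hypothetical, only speed is tracked (duplex is omitted for
brevity), and link_speed stands for whatever the NIC would report at the
time of the event.

```c
#include <assert.h>
#include <stdint.h>

enum model_event { MODEL_EV_UP, MODEL_EV_CHANGE, MODEL_EV_UNREGISTER };
enum model_mode  { MODEL_MODE_ACTIVE_BACKUP, MODEL_MODE_8023AD, MODEL_MODE_LB };

struct model_slave {
	uint32_t speed;        /* speed cached by the bond, in Mb/s */
	uint32_t link_speed;   /* speed the NIC would report right now */
	int ad_notifications;  /* times the 802.3ad logic was told of a change */
};

/* Refresh the cached speed; only 802.3ad mode is notified of a change. */
static void model_refresh(struct model_slave *s, enum model_mode mode)
{
	uint32_t old_speed;

	assert(s);
	old_speed = s->speed;
	s->speed = s->link_speed;
	if (mode == MODEL_MODE_8023AD && old_speed != s->speed)
		s->ad_notifications++;
}

/* Before: refresh only on CHANGE, and only in 802.3ad or load-balancing modes. */
static void model_event_before(struct model_slave *s, enum model_event ev,
			       enum model_mode mode)
{
	if (ev == MODEL_EV_CHANGE &&
	    (mode == MODEL_MODE_8023AD || mode == MODEL_MODE_LB))
		model_refresh(s, mode);
}

/* After: refresh on UP and CHANGE in every mode. */
static void model_event_after(struct model_slave *s, enum model_event ev,
			      enum model_mode mode)
{
	if (ev == MODEL_EV_UP || ev == MODEL_EV_CHANGE)
		model_refresh(s, mode);
}
```

For an active-backup bond whose slave was enslaved at the 100Mb/s fallback,
model_event_before() leaves the cached speed untouched when the link comes
up, while model_event_after() picks up the real speed.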
