Netdev List
* Re: [PATCH net-next-2.6] net: sk_dst_cache RCUification
From: Eric Dumazet @ 2010-04-14  5:47 UTC
  To: David Miller; +Cc: netdev, paulmck
In-Reply-To: <1271223325.16881.600.camel@edumazet-laptop>

On Wednesday 14 April 2010 at 07:35 +0200, Eric Dumazet wrote:
> On Tuesday 13 April 2010 at 16:11 -0700, David Miller wrote:
> > From: Eric Dumazet <eric.dumazet@gmail.com>
> > Date: Wed, 14 Apr 2010 01:04:05 +0200
> > 
> > > Instead of using RCU on the whole "struct socket", my plan is to use a
> > > small structure:
> > > 
> > > struct wait_queue_head_rcu {
> > > 	wait_queue_head_t wait;
> > > 	struct rcu_head	  rcu;
> > > } ____cacheline_aligned_in_smp;
> > > 
> > > and make sk->sk_sleep point to this 'wait' field.
> > 
> > So you're relying upon the fact that in the non-FASYNC case
> > the struct socket's wait queue is never actually used?
> 
> Yes, for the first phase of my work, but async handling might be RCUified
> too in a second phase :)

Oh well, I did not really understand the question, David; please ignore
the answer (I need to wake up fully before...)
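
A minimal userspace sketch of the layout quoted above, for readers who
want to see how a pointer to the 'wait' field can lead back to the
wrapper that also carries the rcu_head. The kernel types are replaced by
stand-ins, and struct sock_sketch and wq_from_sk() are hypothetical names
used only for illustration; nothing here is taken from the eventual
patch.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins (assumptions) for the kernel's wait_queue_head_t and
 * struct rcu_head, so that the sketch builds in userspace. */
typedef struct { int dummy_lock; } wait_queue_head_t;
struct rcu_head {
	struct rcu_head *next;
	void (*func)(struct rcu_head *head);
};

/* Layout from the quoted proposal (cacheline alignment omitted). */
struct wait_queue_head_rcu {
	wait_queue_head_t wait;
	struct rcu_head   rcu;
};

/* Hypothetical stand-in for the one struct sock field that matters here. */
struct sock_sketch {
	wait_queue_head_t *sk_sleep;
};

/* Userspace equivalent of the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical helper: recover the wrapper from sk->sk_sleep, so the
 * embedded rcu_head is reachable when the wait queue must be freed
 * after a grace period. */
static struct wait_queue_head_rcu *wq_from_sk(struct sock_sketch *sk)
{
	assert(sk->sk_sleep != NULL);
	return container_of(sk->sk_sleep, struct wait_queue_head_rcu, wait);
}
```

Because 'wait' is the first member, the pointer arithmetic is a no-op
here, but container_of() keeps the helper correct if the field order
ever changes.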





* Re: usb-sound circular locking again?
From: Takashi Iwai @ 2010-04-14  6:15 UTC
  To: Richard Zidlicky; +Cc: Andrew Morton, linux-kernel, netdev
In-Reply-To: <20100413203039.GC8468@linux-m68k.org>

At Tue, 13 Apr 2010 22:30:39 +0200,
Richard Zidlicky wrote:
> 
> Hi,
> 
> is this the same old issue?

I think so.  The report is relatively new; it has been showing up since
a sysfs lockdep check was introduced.


thanks,

Takashi

> Any way to fix it? Seeing it triggered in a sync
> syscall does not make me comfortable.
> 
> Apr 13 02:01:36 localhost kernel: [ 8569.449882] PM: Syncing filesystems ... 
> Apr 13 02:01:36 localhost kernel: [ 8569.449998] =======================================================
> Apr 13 02:01:36 localhost kernel: [ 8569.450049] [ INFO: possible circular locking dependency detected ]
> Apr 13 02:01:36 localhost kernel: [ 8569.450078] 2.6.33.2v2 #4
> Apr 13 02:01:36 localhost kernel: [ 8569.450101] -------------------------------------------------------
> Apr 13 02:01:36 localhost kernel: [ 8569.450130] pm-hibernate/17348 is trying to acquire lock:
> Apr 13 02:01:36 localhost kernel: [ 8569.450158]  (mutex){+.+...}, at: [<c04e6670>] sync_filesystems+0x14/0xd6
> Apr 13 02:01:36 localhost kernel: [ 8569.450252] 
> Apr 13 02:01:36 localhost kernel: [ 8569.450253] but task is already holding lock:
> Apr 13 02:01:36 localhost kernel: [ 8569.450266]  (pm_mutex){+.+.+.}, at: [<c0466658>] hibernate+0x13/0x18d
> Apr 13 02:01:36 localhost kernel: [ 8569.450266] 
> Apr 13 02:01:36 localhost kernel: [ 8569.450266] which lock already depends on the new lock.
> Apr 13 02:01:36 localhost kernel: [ 8569.450266] 
> Apr 13 02:01:36 localhost kernel: [ 8569.450266] 
> Apr 13 02:01:36 localhost kernel: [ 8569.450266] the existing dependency chain (in reverse order) is:
> Apr 13 02:01:36 localhost kernel: [ 8569.450266] 
> Apr 13 02:01:36 localhost kernel: [ 8569.450266] -> #6 (pm_mutex){+.+.+.}:
> Apr 13 02:01:36 localhost kernel: [ 8569.450266]        [<c045b60a>] __lock_acquire+0xa2d/0xbb7
> Apr 13 02:01:36 localhost kernel: [ 8569.450266]        [<c045b828>] lock_acquire+0x94/0xb1
> Apr 13 02:01:36 localhost kernel: [ 8569.450266]        [<c0736d84>] __mutex_lock_common+0x35/0x2f3
> Apr 13 02:01:36 localhost kernel: [ 8569.450266]        [<c07370e0>] mutex_lock_nested+0x30/0x38
> Apr 13 02:01:36 localhost kernel: [ 8569.450266]        [<c0466658>] hibernate+0x13/0x18d
> Apr 13 02:01:36 localhost kernel: [ 8569.450266]        [<c046551c>] state_store+0x56/0xa8
> Apr 13 02:01:36 localhost kernel: [ 8569.450266]        [<c05acb19>] kobj_attr_store+0x1a/0x22
> Apr 13 02:01:36 localhost kernel: [ 8569.450266]        [<c050f306>] sysfs_write_file+0xb9/0xe4
> Apr 13 02:01:36 localhost kernel: [ 8569.450266]        [<c04cc821>] vfs_write+0x84/0xdf
> Apr 13 02:01:36 localhost kernel: [ 8569.450266]        [<c04cc915>] sys_write+0x3b/0x60
> Apr 13 02:01:36 localhost kernel: [ 8569.450266]        [<c0738295>] syscall_call+0x7/0xb
> Apr 13 02:01:36 localhost kernel: [ 8569.450266] 
> Apr 13 02:01:36 localhost kernel: [ 8569.450266] -> #5 (s_active){++++.+}:
> Apr 13 02:01:36 localhost kernel: [ 8569.450266]        [<c045b60a>] __lock_acquire+0xa2d/0xbb7
> Apr 13 02:01:36 localhost kernel: [ 8569.450266]        [<c045b828>] lock_acquire+0x94/0xb1
> Apr 13 02:01:36 localhost kernel: [ 8569.450266]        [<c05102f8>] sysfs_addrm_finish+0x89/0xde
> Apr 13 02:01:36 localhost kernel: [ 8569.450266]        [<c050eaf7>] sysfs_hash_and_remove+0x3d/0x4f
> Apr 13 02:01:36 localhost kernel: [ 8569.450266]        [<c0511100>] sysfs_remove_group+0x74/0xa3
> Apr 13 02:01:36 localhost kernel: [ 8569.450266]        [<c062e16c>] dpm_sysfs_remove+0x10/0x12
> Apr 13 09:39:32 localhost kernel: [ 8569.450266]        [<c062933f>] device_del+0x33/0x154
> Apr 13 09:39:34 localhost kernel: [ 8569.450266]        [<c0629488>] device_unregister+0x28/0x4b
> Apr 13 09:39:34 localhost kernel: [ 8569.450266]        [<c067b7c5>] usb_remove_ep_devs+0x15/0x1f
> Apr 13 09:39:34 localhost kernel: [ 8569.450266]        [<c0675c92>] remove_intf_ep_devs+0x21/0x32
> Apr 13 09:39:34 localhost kernel: [ 8569.450266]        [<c0676d53>] usb_set_interface+0x18c/0x22c
> Apr 13 09:39:34 localhost kernel: [ 8569.450266]        [<f8302c46>] snd_usb_capture_close+0x26/0x3f [snd_usb_audio]
> Apr 13 09:39:34 localhost kernel: [ 8569.450266]        [<f80fbb08>] snd_pcm_release_substream+0x3d/0x66 [snd_pcm]
> Apr 13 09:39:34 localhost kernel: [ 8569.450266]        [<f80fbb8d>] snd_pcm_release+0x5c/0x9e [snd_pcm]
> Apr 13 09:39:34 localhost kernel: [ 8569.450266]        [<c04cd12a>] __fput+0xf0/0x187
> Apr 13 09:39:34 localhost kernel: [ 8569.450266]        [<c04cd1da>] fput+0x19/0x1b
> Apr 13 09:39:34 localhost kernel: [ 8569.450266]        [<c04b2e9f>] remove_vma+0x3e/0x5d
> Apr 13 09:39:34 localhost kernel: [ 8569.450266]        [<c04b3b2a>] do_munmap+0x23c/0x259
> Apr 13 09:39:34 localhost kernel: [ 8569.450266]        [<c04b3b77>] sys_munmap+0x30/0x3f
> Apr 13 09:39:34 localhost kernel: [ 8569.450266]        [<c0738295>] syscall_call+0x7/0xb
> Apr 13 09:39:34 localhost kernel: [ 8569.450266] 
> Apr 13 09:39:34 localhost kernel: [ 8569.450266] -> #4 (&pcm->open_mutex){+.+.+.}:
> Apr 13 09:39:34 localhost kernel: [ 8569.454127]        [<c045b60a>] __lock_acquire+0xa2d/0xbb7
> Apr 13 09:39:34 localhost kernel: [ 8569.454127]        [<c045b828>] lock_acquire+0x94/0xb1
> Apr 13 09:39:34 localhost kernel: [ 8569.454127]        [<c0736d84>] __mutex_lock_common+0x35/0x2f3
> Apr 13 09:39:34 localhost kernel: [ 8569.454127]        [<c07370e0>] mutex_lock_nested+0x30/0x38
> Apr 13 09:39:34 localhost kernel: [ 8569.454127]        [<f80fbb86>] snd_pcm_release+0x55/0x9e [snd_pcm]
> Apr 13 09:39:34 localhost kernel: [ 8569.454127]        [<c04cd12a>] __fput+0xf0/0x187
> Apr 13 09:39:34 localhost kernel: [ 8569.454127]        [<c04cd1da>] fput+0x19/0x1b
> Apr 13 09:39:34 localhost kernel: [ 8569.454127]        [<c04b2e9f>] remove_vma+0x3e/0x5d
> Apr 13 09:39:34 localhost kernel: [ 8569.454127]        [<c04b3b2a>] do_munmap+0x23c/0x259
> Apr 13 09:39:34 localhost kernel: [ 8569.454127]        [<c04b3b77>] sys_munmap+0x30/0x3f
> Apr 13 09:39:34 localhost kernel: [ 8569.455127]        [<c0738295>] syscall_call+0x7/0xb
> Apr 13 09:39:34 localhost kernel: [ 8569.455127] 
> Apr 13 09:39:34 localhost kernel: [ 8569.455127] -> #3 (&mm->mmap_sem){++++++}:
> Apr 13 09:39:34 localhost kernel: [ 8569.455127]        [<c045b60a>] __lock_acquire+0xa2d/0xbb7
> Apr 13 09:39:34 localhost kernel: [ 8569.455127]        [<c045b828>] lock_acquire+0x94/0xb1
> Apr 13 09:39:34 localhost kernel: [ 8569.455127]        [<c04add1a>] might_fault+0x64/0x81
> Apr 13 09:39:34 localhost kernel: [ 8569.455127]        [<c05b3828>] copy_to_user+0x2c/0xfc
> Apr 13 09:39:34 localhost kernel: [ 8569.455127]        [<c04d784b>] filldir64+0x97/0xcd
> Apr 13 09:39:34 localhost kernel: [ 8569.455127]        [<c04e299c>] dcache_readdir+0x5a/0x1af
> Apr 13 09:39:34 localhost kernel: [ 8569.456129]        [<c04d7a5d>] vfs_readdir+0x68/0x94
> Apr 13 09:39:34 localhost kernel: [ 8569.456129]        [<c04d7aec>] sys_getdents64+0x63/0xa0
> Apr 13 09:39:34 localhost kernel: [ 8569.456129]        [<c0738295>] syscall_call+0x7/0xb
> Apr 13 09:39:34 localhost kernel: [ 8569.456129] 
> Apr 13 09:39:34 localhost kernel: [ 8569.456129] -> #2 (&sb->s_type->i_mutex_key#3){+.+.+.}:
> Apr 13 09:39:34 localhost kernel: [ 8569.456129]        [<c045b60a>] __lock_acquire+0xa2d/0xbb7
> Apr 13 09:39:34 localhost kernel: [ 8569.456129]        [<c045b828>] lock_acquire+0x94/0xb1
> Apr 13 09:39:34 localhost kernel: [ 8569.456129]        [<c0736d84>] __mutex_lock_common+0x35/0x2f3
> Apr 13 09:39:34 localhost kernel: [ 8569.456129]        [<c07370e0>] mutex_lock_nested+0x30/0x38
> Apr 13 09:39:34 localhost kernel: [ 8569.456129]        [<c051164f>] devpts_get_sb+0x1c0/0x29f
> Apr 13 09:39:34 localhost kernel: [ 8569.456129]        [<c04ce0db>] vfs_kern_mount+0x86/0x11f
> Apr 13 09:39:34 localhost kernel: [ 8569.456129]        [<c04ce1b8>] do_kern_mount+0x32/0xbe
> Apr 13 09:39:34 localhost kernel: [ 8569.456129]        [<c04e02c2>] do_mount+0x671/0x6d0
> Apr 13 09:39:34 localhost kernel: [ 8569.456129]        [<c04e0382>] sys_mount+0x61/0x8f
> Apr 13 09:39:34 localhost kernel: [ 8569.456129]        [<c0738295>] syscall_call+0x7/0xb
> Apr 13 09:39:34 localhost kernel: [ 8569.456129] 
> Apr 13 09:39:34 localhost kernel: [ 8569.456129] -> #1 (&type->s_umount_key#19){++++..}:
> Apr 13 09:39:34 localhost kernel: [ 8569.458127]        [<c045b60a>] __lock_acquire+0xa2d/0xbb7
> Apr 13 09:39:34 localhost kernel: [ 8569.458127]        [<c045b828>] lock_acquire+0x94/0xb1
> Apr 13 09:39:34 localhost kernel: [ 8569.458127]        [<c0737310>] down_read+0x31/0x45
> Apr 13 09:39:34 localhost kernel: [ 8569.458127]        [<c04e66cf>] sync_filesystems+0x73/0xd6
> Apr 13 09:39:34 localhost kernel: [ 8569.458127]        [<c04e676e>] sys_sync+0x11/0x2d
> Apr 13 09:39:34 localhost kernel: [ 8569.458127]        [<c0738295>] syscall_call+0x7/0xb
> Apr 13 09:39:34 localhost kernel: [ 8569.458127] 
> Apr 13 09:39:34 localhost kernel: [ 8569.458127] -> #0 (mutex){+.+...}:
> Apr 13 09:39:34 localhost kernel: [ 8569.458127]        [<c045b517>] __lock_acquire+0x93a/0xbb7
> Apr 13 09:39:34 localhost kernel: [ 8569.458127]        [<c045b828>] lock_acquire+0x94/0xb1
> Apr 13 09:39:34 localhost kernel: [ 8569.458127]        [<c0736d84>] __mutex_lock_common+0x35/0x2f3
> Apr 13 09:39:34 localhost kernel: [ 8569.458127]        [<c07370e0>] mutex_lock_nested+0x30/0x38
> Apr 13 09:39:34 localhost kernel: [ 8569.458127]        [<c04e6670>] sync_filesystems+0x14/0xd6
> Apr 13 09:39:34 localhost kernel: [ 8569.458127]        [<c04e676e>] sys_sync+0x11/0x2d
> Apr 13 09:39:34 localhost kernel: [ 8569.458127]        [<c04666c2>] hibernate+0x7d/0x18d
> Apr 13 09:39:34 localhost kernel: [ 8569.459761]        [<c046551c>] state_store+0x56/0xa8
> Apr 13 09:39:34 localhost kernel: [ 8569.459761]        [<c05acb19>] kobj_attr_store+0x1a/0x22
> Apr 13 09:39:34 localhost kernel: [ 8569.459761]        [<c050f306>] sysfs_write_file+0xb9/0xe4
> Apr 13 09:39:34 localhost kernel: [ 8569.459761]        [<c04cc821>] vfs_write+0x84/0xdf
> Apr 13 09:39:34 localhost kernel: [ 8569.460128]        [<c04cc915>] sys_write+0x3b/0x60
> Apr 13 09:39:34 localhost kernel: [ 8569.460128]        [<c0738295>] syscall_call+0x7/0xb
> Apr 13 09:39:34 localhost kernel: [ 8569.460128] 
> Apr 13 09:39:34 localhost kernel: [ 8569.460128] other info that might help us debug this:
> Apr 13 09:39:34 localhost kernel: [ 8569.460128] 
> Apr 13 09:39:34 localhost kernel: [ 8569.460128] 4 locks held by pm-hibernate/17348:
> Apr 13 09:39:34 localhost kernel: [ 8569.460128]  #0:  (&buffer->mutex){+.+.+.}, at: [<c050f272>] sysfs_write_file+0x25/0xe4
> Apr 13 09:39:34 localhost kernel: [ 8569.460128]  #1:  (s_active){++++.+}, at: [<c0510544>] sysfs_get_active_two+0x16/0x36
> Apr 13 09:39:34 localhost kernel: [ 8569.461127]  #2:  (s_active){++++.+}, at: [<c051054f>] sysfs_get_active_two+0x21/0x36
> Apr 13 09:39:34 localhost kernel: [ 8569.461127]  #3:  (pm_mutex){+.+.+.}, at: [<c0466658>] hibernate+0x13/0x18d
> Apr 13 09:39:34 localhost kernel: [ 8569.461127] 
> Apr 13 09:39:34 localhost kernel: [ 8569.461127] stack backtrace:
> Apr 13 09:39:34 localhost kernel: [ 8569.461127] Pid: 17348, comm: pm-hibernate Not tainted 2.6.33.2v2 #4
> Apr 13 09:39:34 localhost kernel: [ 8569.461127] Call Trace:
> Apr 13 09:39:34 localhost kernel: [ 8569.461127]  [<c0735b79>] ? printk+0xf/0x16
> Apr 13 09:39:34 localhost kernel: [ 8569.461127]  [<c045a8a0>] print_circular_bug+0x90/0x9c
> Apr 13 09:39:34 localhost kernel: [ 8569.462128]  [<c045b517>] __lock_acquire+0x93a/0xbb7
> Apr 13 09:39:34 localhost kernel: [ 8569.462128]  [<c042730d>] ? update_curr+0x177/0x17f
> Apr 13 09:39:34 localhost kernel: [ 8569.462128]  [<c0459bf5>] ? mark_lock+0x1e/0x1ea
> Apr 13 09:39:34 localhost kernel: [ 8569.462128]  [<c045b828>] lock_acquire+0x94/0xb1
> Apr 13 09:39:34 localhost kernel: [ 8569.462128]  [<c04e6670>] ? sync_filesystems+0x14/0xd6
> Apr 13 09:39:34 localhost kernel: [ 8569.462128]  [<c0736d84>] __mutex_lock_common+0x35/0x2f3
> Apr 13 09:39:34 localhost kernel: [ 8569.462128]  [<c04e6670>] ? sync_filesystems+0x14/0xd6
> Apr 13 09:39:34 localhost kernel: [ 8569.462128]  [<c04e3423>] ? bdi_alloc_queue_work+0x84/0xa0
> Apr 13 09:39:34 localhost kernel: [ 8569.462128]  [<c07370e0>] mutex_lock_nested+0x30/0x38
> Apr 13 09:39:34 localhost kernel: [ 8569.462128]  [<c04e6670>] ? sync_filesystems+0x14/0xd6
> Apr 13 09:39:34 localhost kernel: [ 8569.462128]  [<c04e6670>] sync_filesystems+0x14/0xd6
> Apr 13 09:39:34 localhost kernel: [ 8569.462128]  [<c04e676e>] sys_sync+0x11/0x2d
> Apr 13 09:39:34 localhost kernel: [ 8569.463127]  [<c04666c2>] hibernate+0x7d/0x18d
> Apr 13 09:39:34 localhost kernel: [ 8569.463127]  [<c04654c6>] ? state_store+0x0/0xa8
> Apr 13 09:39:34 localhost kernel: [ 8569.463127]  [<c046551c>] state_store+0x56/0xa8
> Apr 13 09:39:34 localhost kernel: [ 8569.463127]  [<c04654c6>] ? state_store+0x0/0xa8
> Apr 13 09:39:34 localhost kernel: [ 8569.463127]  [<c05acb19>] kobj_attr_store+0x1a/0x22
> Apr 13 09:39:34 localhost kernel: [ 8569.463127]  [<c050f306>] sysfs_write_file+0xb9/0xe4
> Apr 13 09:39:34 localhost kernel: [ 8569.463127]  [<c050f24d>] ? sysfs_write_file+0x0/0xe4
> Apr 13 09:39:34 localhost kernel: [ 8569.463127]  [<c04cc821>] vfs_write+0x84/0xdf
> Apr 13 09:39:34 localhost kernel: [ 8569.463127]  [<c04cc915>] sys_write+0x3b/0x60
> Apr 13 09:39:34 localhost kernel: [ 8569.463127]  [<c0738295>] syscall_call+0x7/0xb
> Apr 13 09:39:34 localhost kernel: [ 8569.484133] done.
> 
> Apr 13 09:39:34 localhost kernel: [ 8569.484223] Freezing user space processes ... (elapsed 0.04 seconds) done.
> Apr 13 09:39:34 localhost kernel: [ 8569.528142] Freezing remaining freezable tasks ... (elapsed 0.01 seconds) done.
> Apr 13 09:39:34 localhost kernel: [ 8569.539272] PM: Preallocating image memory... done (allocated 349210 pages)
> Apr 13 09:39:34 localhost kernel: [ 8583.627118] PM: Allocated 1396840 kbytes in 14.08 seconds (99.20 MB/s)
> 
> Regards,
> Richard
> 
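
The dependency chain in the report above can be read as a directed graph
of "taken while holding" edges, with the new pm_mutex -> mutex
acquisition closing the loop. A simplified sketch of that check follows.
It is not lockdep's actual implementation; the enum names and the
functions record_order(), closes_cycle() and load_reported_chain() are
assumptions made for illustration, and only the lock names and their
order come from the log.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Lock classes named in the report, chain entries #0 to #6. */
enum lock_class {
	MUTEX, S_UMOUNT, I_MUTEX, MMAP_SEM, OPEN_MUTEX, S_ACTIVE, PM_MUTEX,
	NR_CLASSES
};

/* taken_under[a][b] is true if b was once acquired while a was held. */
static bool taken_under[NR_CLASSES][NR_CLASSES];

static void record_order(enum lock_class held, enum lock_class taken)
{
	taken_under[held][taken] = true;
}

/* Depth-first search: can 'to' be reached from 'from' via recorded edges? */
static bool reaches(enum lock_class from, enum lock_class to, bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < NR_CLASSES; next++) {
		if (taken_under[from][next] && !seen[next] &&
		    reaches((enum lock_class)next, to, seen))
			return true;
	}
	return false;
}

/* Acquiring 'wanted' while holding 'held' closes a cycle if the recorded
 * orderings already lead from 'wanted' back to 'held'. */
static bool closes_cycle(enum lock_class held, enum lock_class wanted)
{
	bool seen[NR_CLASSES] = { false };

	return reaches(wanted, held, seen);
}

/* Existing chain from the report, read from #0 up to #6. */
static void load_reported_chain(void)
{
	memset(taken_under, 0, sizeof(taken_under));
	record_order(MUTEX, S_UMOUNT);      /* sync_filesystems */
	record_order(S_UMOUNT, I_MUTEX);    /* devpts_get_sb */
	record_order(I_MUTEX, MMAP_SEM);    /* readdir -> copy_to_user */
	record_order(MMAP_SEM, OPEN_MUTEX); /* munmap -> snd_pcm_release */
	record_order(OPEN_MUTEX, S_ACTIVE); /* usb_set_interface -> sysfs removal */
	record_order(S_ACTIVE, PM_MUTEX);   /* sysfs write -> hibernate */
}
```

With the chain loaded, closes_cycle(PM_MUTEX, MUTEX) reports the
inversion seen when hibernate() calls sys_sync() under pm_mutex, while
the opposite order is consistent with the recorded edges.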


* [net-next 3/7] stmmac: fix Transmit FIFO flush operation
From: Giuseppe CAVALLARO @ 2010-04-14  6:21 UTC
  To: netdev; +Cc: Giuseppe Cavallaro
In-Reply-To: <1271226077-25882-2-git-send-email-peppe.cavallaro@st.com>

Fix the Transmit FIFO flush operation; it was
disabled while reworking the descriptor structures.

Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
---
 drivers/net/stmmac/common.h        |    1 +
 drivers/net/stmmac/dwmac1000.h     |    1 -
 drivers/net/stmmac/dwmac1000_dma.c |    9 ---------
 drivers/net/stmmac/dwmac_dma.h     |    1 +
 drivers/net/stmmac/dwmac_lib.c     |    7 +++++++
 drivers/net/stmmac/enh_desc.c      |    6 +++---
 6 files changed, 12 insertions(+), 13 deletions(-)

diff --git a/drivers/net/stmmac/common.h b/drivers/net/stmmac/common.h
index bd3b785..27a05b4 100644
--- a/drivers/net/stmmac/common.h
+++ b/drivers/net/stmmac/common.h
@@ -244,3 +244,4 @@ extern void stmmac_set_mac_addr(unsigned long ioaddr, u8 addr[6],
 				unsigned int high, unsigned int low);
 extern void stmmac_get_mac_addr(unsigned long ioaddr, unsigned char *addr,
 				unsigned int high, unsigned int low);
+extern void dwmac_dma_flush_tx_fifo(unsigned long ioaddr);
diff --git a/drivers/net/stmmac/dwmac1000.h b/drivers/net/stmmac/dwmac1000.h
index 3b784fc..d8d0f35 100644
--- a/drivers/net/stmmac/dwmac1000.h
+++ b/drivers/net/stmmac/dwmac1000.h
@@ -172,7 +172,6 @@ enum rfd {
 	deac_full_minus_4 = 0x00401800,
 };
 #define DMA_CONTROL_TSF		0x00200000 /* Transmit  Store and Forward */
-#define DMA_CONTROL_FTF		0x00100000 /* Flush transmit FIFO */
 
 enum ttc_control {
 	DMA_CONTROL_TTC_64 = 0x00000000,
diff --git a/drivers/net/stmmac/dwmac1000_dma.c b/drivers/net/stmmac/dwmac1000_dma.c
index 8d3ea99..a547aa9 100644
--- a/drivers/net/stmmac/dwmac1000_dma.c
+++ b/drivers/net/stmmac/dwmac1000_dma.c
@@ -58,15 +58,6 @@ static int dwmac1000_dma_init(unsigned long ioaddr, int pbl, u32 dma_tx,
 	return 0;
 }
 
-/* Transmit FIFO flush operation */
-static void dwmac1000_flush_tx_fifo(unsigned long ioaddr)
-{
-	u32 csr6 = readl(ioaddr + DMA_CONTROL);
-	writel((csr6 | DMA_CONTROL_FTF), ioaddr + DMA_CONTROL);
-
-	do {} while ((readl(ioaddr + DMA_CONTROL) & DMA_CONTROL_FTF));
-}
-
 static void dwmac1000_dma_operation_mode(unsigned long ioaddr, int txmode,
 				    int rxmode)
 {
diff --git a/drivers/net/stmmac/dwmac_dma.h b/drivers/net/stmmac/dwmac_dma.h
index de848d9..7b815a1 100644
--- a/drivers/net/stmmac/dwmac_dma.h
+++ b/drivers/net/stmmac/dwmac_dma.h
@@ -95,6 +95,7 @@
 #define DMA_STATUS_TU	0x00000004	/* Transmit Buffer Unavailable */
 #define DMA_STATUS_TPS	0x00000002	/* Transmit Process Stopped */
 #define DMA_STATUS_TI	0x00000001	/* Transmit Interrupt */
+#define DMA_CONTROL_FTF		0x00100000 /* Flush transmit FIFO */
 
 extern void dwmac_enable_dma_transmission(unsigned long ioaddr);
 extern void dwmac_enable_dma_irq(unsigned long ioaddr);
diff --git a/drivers/net/stmmac/dwmac_lib.c b/drivers/net/stmmac/dwmac_lib.c
index d4adb1e..0a504ad 100644
--- a/drivers/net/stmmac/dwmac_lib.c
+++ b/drivers/net/stmmac/dwmac_lib.c
@@ -227,6 +227,13 @@ int dwmac_dma_interrupt(unsigned long ioaddr,
 	return ret;
 }
 
+void dwmac_dma_flush_tx_fifo(unsigned long ioaddr)
+{
+	u32 csr6 = readl(ioaddr + DMA_CONTROL);
+	writel((csr6 | DMA_CONTROL_FTF), ioaddr + DMA_CONTROL);
+
+	do {} while ((readl(ioaddr + DMA_CONTROL) & DMA_CONTROL_FTF));
+}
 
 void stmmac_set_mac_addr(unsigned long ioaddr, u8 addr[6],
 			 unsigned int high, unsigned int low)
diff --git a/drivers/net/stmmac/enh_desc.c b/drivers/net/stmmac/enh_desc.c
index e5ac259..eb5684a 100644
--- a/drivers/net/stmmac/enh_desc.c
+++ b/drivers/net/stmmac/enh_desc.c
@@ -40,7 +40,7 @@ static int enh_desc_get_tx_status(void *data, struct stmmac_extra_stats *x,
 		if (unlikely(p->des01.etx.frame_flushed)) {
 			CHIP_DBG(KERN_ERR "\tframe_flushed error\n");
 			x->tx_frame_flushed++;
-			/*enh_desc_flush_tx_fifo(ioaddr);*/
+			dwmac_dma_flush_tx_fifo(ioaddr);
 		}
 
 		if (unlikely(p->des01.etx.loss_carrier)) {
@@ -68,7 +68,7 @@ static int enh_desc_get_tx_status(void *data, struct stmmac_extra_stats *x,
 
 		if (unlikely(p->des01.etx.underflow_error)) {
 			CHIP_DBG(KERN_ERR "\tunderflow error\n");
-			/*enh_desc_flush_tx_fifo(ioaddr);*/
+			dwmac_dma_flush_tx_fifo(ioaddr);
 			x->tx_underflow++;
 		}
 
@@ -80,7 +80,7 @@ static int enh_desc_get_tx_status(void *data, struct stmmac_extra_stats *x,
 		if (unlikely(p->des01.etx.payload_error)) {
 			CHIP_DBG(KERN_ERR "\tAddr/Payload csum error\n");
 			x->tx_payload_error++;
-			/*enh_desc_flush_tx_fifo(ioaddr);*/
+			dwmac_dma_flush_tx_fifo(ioaddr);
 		}
 
 		ret = -1;
-- 
1.6.0.4
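
The helper moved into dwmac_lib.c sets DMA_CONTROL_FTF and then polls
until the bit reads back as clear, with no upper bound on the wait. A
userspace sketch of the same set-then-poll pattern with a retry limit is
shown below. The register is simulated, and fake_readl(), fake_writel(),
flush_tx_fifo_bounded() and the read countdown are illustrative
assumptions that do not appear in the patch; only the DMA_CONTROL_FTF
value is taken from it.

```c
#include <assert.h>
#include <stdint.h>

#define DMA_CONTROL_FTF 0x00100000u /* Flush transmit FIFO (from the patch) */

/* Simulated DMA control register. The fake hardware clears FTF once
 * fake_reads_until_done reads have been made with the bit set. */
static uint32_t fake_dma_control;
static int fake_reads_until_done;

static uint32_t fake_readl(void)
{
	if ((fake_dma_control & DMA_CONTROL_FTF) &&
	    fake_reads_until_done-- <= 0)
		fake_dma_control &= ~DMA_CONTROL_FTF;
	return fake_dma_control;
}

static void fake_writel(uint32_t value)
{
	fake_dma_control = value;
}

/* Request a flush, then poll at most max_polls times.
 * Returns 0 once FTF has cleared, -1 if it is still set afterwards. */
static int flush_tx_fifo_bounded(int max_polls)
{
	uint32_t csr6 = fake_readl();

	assert(max_polls > 0);
	fake_writel(csr6 | DMA_CONTROL_FTF);

	while (max_polls-- > 0) {
		if (!(fake_readl() & DMA_CONTROL_FTF))
			return 0;
	}
	return -1;
}
```

Whether a bounded wait is appropriate in the tx-status path is a design
question for the driver; the sketch only shows what the variant would
look like.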



* [net-next 4/7] stmmac: new descriptor field for the driver's platform
From: Giuseppe CAVALLARO @ 2010-04-14  6:21 UTC
  To: netdev; +Cc: Giuseppe Cavallaro
In-Reply-To: <1271226077-25882-3-git-send-email-peppe.cavallaro@st.com>

The new enh_desc field is used to select the enhanced descriptor
structure. There are several scenarios: some chips (mac10/100
or gmac) want to use the enhanced descriptors; others want the normal
ones.
For example, on ST platforms the MAC10/100 uses the normal descriptor
structure and the GMAC uses the enhanced one.
It can be useful to get this information from the platform.
This could also be decided at run time by looking at the chip's ID number,
but chips with the same ID may want to use different descriptor
structures.

Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
---
 include/linux/stmmac.h |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/include/linux/stmmac.h b/include/linux/stmmac.h
index 32bfd1a..632ff7c 100644
--- a/include/linux/stmmac.h
+++ b/include/linux/stmmac.h
@@ -33,6 +33,7 @@ struct plat_stmmacenet_data {
 	int bus_id;
 	int pbl;
 	int has_gmac;
+	int enh_desc;
 	void (*fix_mac_speed)(void *priv, unsigned int speed);
 	void (*bus_setup)(unsigned long ioaddr);
 #ifdef CONFIG_STM_DRIVERS
-- 
1.6.0.4
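
A minimal sketch of how a driver might consume the new field when
choosing its descriptor callbacks. Only has_gmac and enh_desc come from
the patch; struct desc_ops_sketch, the two ops tables and
select_desc_ops() are hypothetical names, and the platform data
structure is trimmed to the fields used here.

```c
#include <assert.h>
#include <string.h>

/* Trimmed copy of the platform data: only the fields used below. */
struct plat_stmmacenet_data {
	int has_gmac;
	int enh_desc;
};

/* Hypothetical descriptor operations table. */
struct desc_ops_sketch {
	const char *name;
};

static const struct desc_ops_sketch enh_desc_ops_sketch  = { "enhanced" };
static const struct desc_ops_sketch norm_desc_ops_sketch = { "normal" };

/* The descriptor layout follows the platform's enh_desc flag, not the
 * MAC type, so a mac10/100 may still ask for enhanced descriptors. */
static const struct desc_ops_sketch *
select_desc_ops(const struct plat_stmmacenet_data *plat)
{
	assert(plat != NULL);
	return plat->enh_desc ? &enh_desc_ops_sketch : &norm_desc_ops_sketch;
}
```

Keeping the choice independent of has_gmac is what lets two chips with
the same ID use different descriptor structures, as described in the
commit message.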



* [net-next 2/7] stmmac: rework normal and enhanced descriptors
From: Giuseppe CAVALLARO @ 2010-04-14  6:21 UTC
  To: netdev; +Cc: Giuseppe Cavallaro
In-Reply-To: <1271226077-25882-1-git-send-email-peppe.cavallaro@st.com>

Currently the driver assumes that the mac10/100 can only use the
normal descriptor structure and the gmac can only use the
enhanced one.
This patch removes the descriptor code from the DMA files
and adds two new files just for handling the normal and enhanced
descriptors.

Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
---
 drivers/net/stmmac/Makefile         |    2 +-
 drivers/net/stmmac/common.h         |   15 ++-
 drivers/net/stmmac/dwmac100.h       |   12 --
 drivers/net/stmmac/dwmac1000.h      |   11 -
 drivers/net/stmmac/dwmac1000_core.c |   27 ++--
 drivers/net/stmmac/dwmac1000_dma.c  |  327 +---------------------------------
 drivers/net/stmmac/dwmac100_core.c  |    3 +-
 drivers/net/stmmac/dwmac100_dma.c   |  223 +----------------------
 drivers/net/stmmac/enh_desc.c       |  342 +++++++++++++++++++++++++++++++++++
 drivers/net/stmmac/norm_desc.c      |  240 ++++++++++++++++++++++++
 drivers/net/stmmac/stmmac.h         |    2 +
 drivers/net/stmmac/stmmac_main.c    |    7 +-
 12 files changed, 627 insertions(+), 584 deletions(-)
 create mode 100644 drivers/net/stmmac/enh_desc.c
 create mode 100644 drivers/net/stmmac/norm_desc.c

diff --git a/drivers/net/stmmac/Makefile b/drivers/net/stmmac/Makefile
index b14bd56..9691733 100644
--- a/drivers/net/stmmac/Makefile
+++ b/drivers/net/stmmac/Makefile
@@ -2,4 +2,4 @@ obj-$(CONFIG_STMMAC_ETH) += stmmac.o
 stmmac-$(CONFIG_STMMAC_TIMER) += stmmac_timer.o
 stmmac-objs:= stmmac_main.o stmmac_ethtool.o stmmac_mdio.o	\
 	      dwmac_lib.o dwmac1000_core.o  dwmac1000_dma.o	\
-	      dwmac100_core.o dwmac100_dma.o $(stmmac-y)
+	      dwmac100_core.o dwmac100_dma.o enh_desc.o  norm_desc.o $(stmmac-y)
diff --git a/drivers/net/stmmac/common.h b/drivers/net/stmmac/common.h
index 2a58172..bd3b785 100644
--- a/drivers/net/stmmac/common.h
+++ b/drivers/net/stmmac/common.h
@@ -22,8 +22,21 @@
   Author: Giuseppe Cavallaro <peppe.cavallaro@st.com>
 *******************************************************************************/
 
-#include "descs.h"
 #include <linux/netdevice.h>
+#include "descs.h"
+
+#undef CHIP_DEBUG_PRINT
+/* Turn-on extra printk debug for MAC core, dma and descriptors */
+/* #define CHIP_DEBUG_PRINT */
+
+#ifdef CHIP_DEBUG_PRINT
+#define CHIP_DBG(fmt, args...)  printk(fmt, ## args)
+#else
+#define CHIP_DBG(fmt, args...)  do { } while (0)
+#endif
+
+#undef FRAME_FILTER_DEBUG
+/* #define FRAME_FILTER_DEBUG */
 
 struct stmmac_extra_stats {
 	/* Transmit errors */
diff --git a/drivers/net/stmmac/dwmac100.h b/drivers/net/stmmac/dwmac100.h
index 9f4ba2e..97956cb 100644
--- a/drivers/net/stmmac/dwmac100.h
+++ b/drivers/net/stmmac/dwmac100.h
@@ -118,16 +118,4 @@ enum ttc_control {
 #define DMA_MISSED_FRAME_OVE_M	0x00010000	/* Missed Frame Overflow */
 #define DMA_MISSED_FRAME_M_CNTR	0x0000ffff	/* Missed Frame Couinter */
 
-#undef DWMAC100_DEBUG
-/* #define DWMAC100__DEBUG */
-#undef FRAME_FILTER_DEBUG
-/* #define FRAME_FILTER_DEBUG */
-#ifdef DWMAC100__DEBUG
-#define DBG(fmt, args...)  printk(fmt, ## args)
-#else
-#define DBG(fmt, args...)  do { } while (0)
-#endif
-
 extern struct stmmac_dma_ops dwmac100_dma_ops;
-extern struct stmmac_desc_ops dwmac100_desc_ops;
-
diff --git a/drivers/net/stmmac/dwmac1000.h b/drivers/net/stmmac/dwmac1000.h
index 62dca0e..3b784fc 100644
--- a/drivers/net/stmmac/dwmac1000.h
+++ b/drivers/net/stmmac/dwmac1000.h
@@ -206,15 +206,4 @@ enum rtc_control {
 #define GMAC_MMC_TX_INTR   0x108
 #define GMAC_MMC_RX_CSUM_OFFLOAD   0x208
 
-#undef DWMAC1000_DEBUG
-/* #define DWMAC1000__DEBUG */
-#undef FRAME_FILTER_DEBUG
-/* #define FRAME_FILTER_DEBUG */
-#ifdef DWMAC1000__DEBUG
-#define DBG(fmt, args...)  printk(fmt, ## args)
-#else
-#define DBG(fmt, args...)  do { } while (0)
-#endif
-
 extern struct stmmac_dma_ops dwmac1000_dma_ops;
-extern struct stmmac_desc_ops dwmac1000_desc_ops;
diff --git a/drivers/net/stmmac/dwmac1000_core.c b/drivers/net/stmmac/dwmac1000_core.c
index f9c7c1c..0aa89ae 100644
--- a/drivers/net/stmmac/dwmac1000_core.c
+++ b/drivers/net/stmmac/dwmac1000_core.c
@@ -83,8 +83,8 @@ static void dwmac1000_set_filter(struct net_device *dev)
 	unsigned long ioaddr = dev->base_addr;
 	unsigned int value = 0;
 
-	DBG(KERN_INFO "%s: # mcasts %d, # unicast %d\n",
-	    __func__, netdev_mc_count(dev), netdev_uc_count(dev));
+	CHIP_DBG(KERN_INFO "%s: # mcasts %d, # unicast %d\n",
+		 __func__, netdev_mc_count(dev), netdev_uc_count(dev));
 
 	if (dev->flags & IFF_PROMISC)
 		value = GMAC_FRAME_FILTER_PR;
@@ -136,7 +136,7 @@ static void dwmac1000_set_filter(struct net_device *dev)
 #endif
 	writel(value, ioaddr + GMAC_FRAME_FILTER);
 
-	DBG(KERN_INFO "\tFrame Filter reg: 0x%08x\n\tHash regs: "
+	CHIP_DBG(KERN_INFO "\tFrame Filter reg: 0x%08x\n\tHash regs: "
 	    "HI 0x%08x, LO 0x%08x\n", readl(ioaddr + GMAC_FRAME_FILTER),
 	    readl(ioaddr + GMAC_HASH_HIGH), readl(ioaddr + GMAC_HASH_LOW));
 
@@ -148,18 +148,18 @@ static void dwmac1000_flow_ctrl(unsigned long ioaddr, unsigned int duplex,
 {
 	unsigned int flow = 0;
 
-	DBG(KERN_DEBUG "GMAC Flow-Control:\n");
+	CHIP_DBG(KERN_DEBUG "GMAC Flow-Control:\n");
 	if (fc & FLOW_RX) {
-		DBG(KERN_DEBUG "\tReceive Flow-Control ON\n");
+		CHIP_DBG(KERN_DEBUG "\tReceive Flow-Control ON\n");
 		flow |= GMAC_FLOW_CTRL_RFE;
 	}
 	if (fc & FLOW_TX) {
-		DBG(KERN_DEBUG "\tTransmit Flow-Control ON\n");
+		CHIP_DBG(KERN_DEBUG "\tTransmit Flow-Control ON\n");
 		flow |= GMAC_FLOW_CTRL_TFE;
 	}
 
 	if (duplex) {
-		DBG(KERN_DEBUG "\tduplex mode: pause time: %d\n", pause_time);
+		CHIP_DBG(KERN_DEBUG "\tduplex mode: PAUSE %d\n", pause_time);
 		flow |= (pause_time << GMAC_FLOW_CTRL_PT_SHIFT);
 	}
 
@@ -172,10 +172,10 @@ static void dwmac1000_pmt(unsigned long ioaddr, unsigned long mode)
 	unsigned int pmt = 0;
 
 	if (mode == WAKE_MAGIC) {
-		DBG(KERN_DEBUG "GMAC: WOL Magic frame\n");
+		CHIP_DBG(KERN_DEBUG "GMAC: WOL Magic frame\n");
 		pmt |= power_down | magic_pkt_en;
 	} else if (mode == WAKE_UCAST) {
-		DBG(KERN_DEBUG "GMAC: WOL on global unicast\n");
+		CHIP_DBG(KERN_DEBUG "GMAC: WOL on global unicast\n");
 		pmt |= global_unicast;
 	}
 
@@ -190,16 +190,16 @@ static void dwmac1000_irq_status(unsigned long ioaddr)
 
 	/* Not used events (e.g. MMC interrupts) are not handled. */
 	if ((intr_status & mmc_tx_irq))
-		DBG(KERN_DEBUG "GMAC: MMC tx interrupt: 0x%08x\n",
+		CHIP_DBG(KERN_DEBUG "GMAC: MMC tx interrupt: 0x%08x\n",
 		    readl(ioaddr + GMAC_MMC_TX_INTR));
 	if (unlikely(intr_status & mmc_rx_irq))
-		DBG(KERN_DEBUG "GMAC: MMC rx interrupt: 0x%08x\n",
+		CHIP_DBG(KERN_DEBUG "GMAC: MMC rx interrupt: 0x%08x\n",
 		    readl(ioaddr + GMAC_MMC_RX_INTR));
 	if (unlikely(intr_status & mmc_rx_csum_offload_irq))
-		DBG(KERN_DEBUG "GMAC: MMC rx csum offload: 0x%08x\n",
+		CHIP_DBG(KERN_DEBUG "GMAC: MMC rx csum offload: 0x%08x\n",
 		    readl(ioaddr + GMAC_MMC_RX_CSUM_OFFLOAD));
 	if (unlikely(intr_status & pmt_irq)) {
-		DBG(KERN_DEBUG "GMAC: received Magic frame\n");
+		CHIP_DBG(KERN_DEBUG "GMAC: received Magic frame\n");
 		/* clear the PMT bits 5 and 6 by reading the PMT
 		 * status register. */
 		readl(ioaddr + GMAC_PMT);
@@ -230,7 +230,6 @@ struct mac_device_info *dwmac1000_setup(unsigned long ioaddr)
 	mac = kzalloc(sizeof(const struct mac_device_info), GFP_KERNEL);
 
 	mac->mac = &dwmac1000_ops;
-	mac->desc = &dwmac1000_desc_ops;
 	mac->dma = &dwmac1000_dma_ops;
 
 	mac->pmt = PMT_SUPPORTED;
diff --git a/drivers/net/stmmac/dwmac1000_dma.c b/drivers/net/stmmac/dwmac1000_dma.c
index 39d436a..8d3ea99 100644
--- a/drivers/net/stmmac/dwmac1000_dma.c
+++ b/drivers/net/stmmac/dwmac1000_dma.c
@@ -3,7 +3,7 @@
   DWC Ether MAC 10/100/1000 Universal version 3.41a  has been used for
   developing this code.
 
-  This contains the functions to handle the dma and descriptors.
+  This contains the functions to handle the dma.
 
   Copyright (C) 2007-2009  STMicroelectronics Ltd
 
@@ -73,14 +73,14 @@ static void dwmac1000_dma_operation_mode(unsigned long ioaddr, int txmode,
 	u32 csr6 = readl(ioaddr + DMA_CONTROL);
 
 	if (txmode == SF_DMA_MODE) {
-		DBG(KERN_DEBUG "GMAC: enabling TX store and forward mode\n");
+		CHIP_DBG(KERN_DEBUG "GMAC: enable TX store and forward mode\n");
 		/* Transmit COE type 2 cannot be done in cut-through mode. */
 		csr6 |= DMA_CONTROL_TSF;
 		/* Operating on second frame increase the performance
 		 * especially when transmit store-and-forward is used.*/
 		csr6 |= DMA_CONTROL_OSF;
 	} else {
-		DBG(KERN_DEBUG "GMAC: disabling TX store and forward mode"
+		CHIP_DBG(KERN_DEBUG "GMAC: disabling TX store and forward mode"
 			      " (threshold = %d)\n", txmode);
 		csr6 &= ~DMA_CONTROL_TSF;
 		csr6 &= DMA_CONTROL_TC_TX_MASK;
@@ -98,10 +98,10 @@ static void dwmac1000_dma_operation_mode(unsigned long ioaddr, int txmode,
 	}
 
 	if (rxmode == SF_DMA_MODE) {
-		DBG(KERN_DEBUG "GMAC: enabling RX store and forward mode\n");
+		CHIP_DBG(KERN_DEBUG "GMAC: enable RX store and forward mode\n");
 		csr6 |= DMA_CONTROL_RSF;
 	} else {
-		DBG(KERN_DEBUG "GMAC: disabling RX store and forward mode"
+		CHIP_DBG(KERN_DEBUG "GMAC: disabling RX store and forward mode"
 			      " (threshold = %d)\n", rxmode);
 		csr6 &= ~DMA_CONTROL_RSF;
 		csr6 &= DMA_CONTROL_TC_RX_MASK;
@@ -141,305 +141,6 @@ static void dwmac1000_dump_dma_regs(unsigned long ioaddr)
 	return;
 }
 
-static int dwmac1000_get_tx_frame_status(void *data,
-				struct stmmac_extra_stats *x,
-				struct dma_desc *p, unsigned long ioaddr)
-{
-	int ret = 0;
-	struct net_device_stats *stats = (struct net_device_stats *)data;
-
-	if (unlikely(p->des01.etx.error_summary)) {
-		DBG(KERN_ERR "GMAC TX error... 0x%08x\n", p->des01.etx);
-		if (unlikely(p->des01.etx.jabber_timeout)) {
-			DBG(KERN_ERR "\tjabber_timeout error\n");
-			x->tx_jabber++;
-		}
-
-		if (unlikely(p->des01.etx.frame_flushed)) {
-			DBG(KERN_ERR "\tframe_flushed error\n");
-			x->tx_frame_flushed++;
-			dwmac1000_flush_tx_fifo(ioaddr);
-		}
-
-		if (unlikely(p->des01.etx.loss_carrier)) {
-			DBG(KERN_ERR "\tloss_carrier error\n");
-			x->tx_losscarrier++;
-			stats->tx_carrier_errors++;
-		}
-		if (unlikely(p->des01.etx.no_carrier)) {
-			DBG(KERN_ERR "\tno_carrier error\n");
-			x->tx_carrier++;
-			stats->tx_carrier_errors++;
-		}
-		if (unlikely(p->des01.etx.late_collision)) {
-			DBG(KERN_ERR "\tlate_collision error\n");
-			stats->collisions += p->des01.etx.collision_count;
-		}
-		if (unlikely(p->des01.etx.excessive_collisions)) {
-			DBG(KERN_ERR "\texcessive_collisions\n");
-			stats->collisions += p->des01.etx.collision_count;
-		}
-		if (unlikely(p->des01.etx.excessive_deferral)) {
-			DBG(KERN_INFO "\texcessive tx_deferral\n");
-			x->tx_deferred++;
-		}
-
-		if (unlikely(p->des01.etx.underflow_error)) {
-			DBG(KERN_ERR "\tunderflow error\n");
-			dwmac1000_flush_tx_fifo(ioaddr);
-			x->tx_underflow++;
-		}
-
-		if (unlikely(p->des01.etx.ip_header_error)) {
-			DBG(KERN_ERR "\tTX IP header csum error\n");
-			x->tx_ip_header_error++;
-		}
-
-		if (unlikely(p->des01.etx.payload_error)) {
-			DBG(KERN_ERR "\tAddr/Payload csum error\n");
-			x->tx_payload_error++;
-			dwmac1000_flush_tx_fifo(ioaddr);
-		}
-
-		ret = -1;
-	}
-
-	if (unlikely(p->des01.etx.deferred)) {
-		DBG(KERN_INFO "GMAC TX status: tx deferred\n");
-		x->tx_deferred++;
-	}
-#ifdef STMMAC_VLAN_TAG_USED
-	if (p->des01.etx.vlan_frame) {
-		DBG(KERN_INFO "GMAC TX status: VLAN frame\n");
-		x->tx_vlan++;
-	}
-#endif
-
-	return ret;
-}
-
-static int dwmac1000_get_tx_len(struct dma_desc *p)
-{
-	return p->des01.etx.buffer1_size;
-}
-
-static int dwmac1000_coe_rdes0(int ipc_err, int type, int payload_err)
-{
-	int ret = good_frame;
-	u32 status = (type << 2 | ipc_err << 1 | payload_err) & 0x7;
-
-	/* bits 5 7 0 | Frame status
-	 * ----------------------------------------------------------
-	 *      0 0 0 | IEEE 802.3 Type frame (length < 1536 octects)
-	 *      1 0 0 | IPv4/6 No CSUM errorS.
-	 *      1 0 1 | IPv4/6 CSUM PAYLOAD error
-	 *      1 1 0 | IPv4/6 CSUM IP HR error
-	 *      1 1 1 | IPv4/6 IP PAYLOAD AND HEADER errorS
-	 *      0 0 1 | IPv4/6 unsupported IP PAYLOAD
-	 *      0 1 1 | COE bypassed.. no IPv4/6 frame
-	 *      0 1 0 | Reserved.
-	 */
-	if (status == 0x0) {
-		DBG(KERN_INFO "RX Des0 status: IEEE 802.3 Type frame.\n");
-		ret = good_frame;
-	} else if (status == 0x4) {
-		DBG(KERN_INFO "RX Des0 status: IPv4/6 No CSUM errorS.\n");
-		ret = good_frame;
-	} else if (status == 0x5) {
-		DBG(KERN_ERR "RX Des0 status: IPv4/6 Payload Error.\n");
-		ret = csum_none;
-	} else if (status == 0x6) {
-		DBG(KERN_ERR "RX Des0 status: IPv4/6 Header Error.\n");
-		ret = csum_none;
-	} else if (status == 0x7) {
-		DBG(KERN_ERR
-		    "RX Des0 status: IPv4/6 Header and Payload Error.\n");
-		ret = csum_none;
-	} else if (status == 0x1) {
-		DBG(KERN_ERR
-		    "RX Des0 status: IPv4/6 unsupported IP PAYLOAD.\n");
-		ret = discard_frame;
-	} else if (status == 0x3) {
-		DBG(KERN_ERR "RX Des0 status: No IPv4, IPv6 frame.\n");
-		ret = discard_frame;
-	}
-	return ret;
-}
-
-static int dwmac1000_get_rx_frame_status(void *data,
-			struct stmmac_extra_stats *x, struct dma_desc *p)
-{
-	int ret = good_frame;
-	struct net_device_stats *stats = (struct net_device_stats *)data;
-
-	if (unlikely(p->des01.erx.error_summary)) {
-		DBG(KERN_ERR "GMAC RX Error Summary... 0x%08x\n", p->des01.erx);
-		if (unlikely(p->des01.erx.descriptor_error)) {
-			DBG(KERN_ERR "\tdescriptor error\n");
-			x->rx_desc++;
-			stats->rx_length_errors++;
-		}
-		if (unlikely(p->des01.erx.overflow_error)) {
-			DBG(KERN_ERR "\toverflow error\n");
-			x->rx_gmac_overflow++;
-		}
-
-		if (unlikely(p->des01.erx.ipc_csum_error))
-			DBG(KERN_ERR "\tIPC Csum Error/Giant frame\n");
-
-		if (unlikely(p->des01.erx.late_collision)) {
-			DBG(KERN_ERR "\tlate_collision error\n");
-			stats->collisions++;
-			stats->collisions++;
-		}
-		if (unlikely(p->des01.erx.receive_watchdog)) {
-			DBG(KERN_ERR "\treceive_watchdog error\n");
-			x->rx_watchdog++;
-		}
-		if (unlikely(p->des01.erx.error_gmii)) {
-			DBG(KERN_ERR "\tReceive Error\n");
-			x->rx_mii++;
-		}
-		if (unlikely(p->des01.erx.crc_error)) {
-			DBG(KERN_ERR "\tCRC error\n");
-			x->rx_crc++;
-			stats->rx_crc_errors++;
-		}
-		ret = discard_frame;
-	}
-
-	/* After a payload csum error, the ES bit is set.
-	 * It doesn't match with the information reported into the databook.
-	 * At any rate, we need to understand if the CSUM hw computation is ok
-	 * and report this info to the upper layers. */
-	ret = dwmac1000_coe_rdes0(p->des01.erx.ipc_csum_error,
-		p->des01.erx.frame_type, p->des01.erx.payload_csum_error);
-
-	if (unlikely(p->des01.erx.dribbling)) {
-		DBG(KERN_ERR "GMAC RX: dribbling error\n");
-		ret = discard_frame;
-	}
-	if (unlikely(p->des01.erx.sa_filter_fail)) {
-		DBG(KERN_ERR "GMAC RX : Source Address filter fail\n");
-		x->sa_rx_filter_fail++;
-		ret = discard_frame;
-	}
-	if (unlikely(p->des01.erx.da_filter_fail)) {
-		DBG(KERN_ERR "GMAC RX : Destination Address filter fail\n");
-		x->da_rx_filter_fail++;
-		ret = discard_frame;
-	}
-	if (unlikely(p->des01.erx.length_error)) {
-		DBG(KERN_ERR "GMAC RX: length_error error\n");
-		x->rx_length++;
-		ret = discard_frame;
-	}
-#ifdef STMMAC_VLAN_TAG_USED
-	if (p->des01.erx.vlan_tag) {
-		DBG(KERN_INFO "GMAC RX: VLAN frame tagged\n");
-		x->rx_vlan++;
-	}
-#endif
-	return ret;
-}
-
-static void dwmac1000_init_rx_desc(struct dma_desc *p, unsigned int ring_size,
-				int disable_rx_ic)
-{
-	int i;
-	for (i = 0; i < ring_size; i++) {
-		p->des01.erx.own = 1;
-		p->des01.erx.buffer1_size = BUF_SIZE_8KiB - 1;
-		/* To support jumbo frames */
-		p->des01.erx.buffer2_size = BUF_SIZE_8KiB - 1;
-		if (i == ring_size - 1)
-			p->des01.erx.end_ring = 1;
-		if (disable_rx_ic)
-			p->des01.erx.disable_ic = 1;
-		p++;
-	}
-	return;
-}
-
-static void dwmac1000_init_tx_desc(struct dma_desc *p, unsigned int ring_size)
-{
-	int i;
-
-	for (i = 0; i < ring_size; i++) {
-		p->des01.etx.own = 0;
-		if (i == ring_size - 1)
-			p->des01.etx.end_ring = 1;
-		p++;
-	}
-
-	return;
-}
-
-static int dwmac1000_get_tx_owner(struct dma_desc *p)
-{
-	return p->des01.etx.own;
-}
-
-static int dwmac1000_get_rx_owner(struct dma_desc *p)
-{
-	return p->des01.erx.own;
-}
-
-static void dwmac1000_set_tx_owner(struct dma_desc *p)
-{
-	p->des01.etx.own = 1;
-}
-
-static void dwmac1000_set_rx_owner(struct dma_desc *p)
-{
-	p->des01.erx.own = 1;
-}
-
-static int dwmac1000_get_tx_ls(struct dma_desc *p)
-{
-	return p->des01.etx.last_segment;
-}
-
-static void dwmac1000_release_tx_desc(struct dma_desc *p)
-{
-	int ter = p->des01.etx.end_ring;
-
-	memset(p, 0, sizeof(struct dma_desc));
-	p->des01.etx.end_ring = ter;
-
-	return;
-}
-
-static void dwmac1000_prepare_tx_desc(struct dma_desc *p, int is_fs, int len,
-				 int csum_flag)
-{
-	p->des01.etx.first_segment = is_fs;
-	if (unlikely(len > BUF_SIZE_4KiB)) {
-		p->des01.etx.buffer1_size = BUF_SIZE_4KiB;
-		p->des01.etx.buffer2_size = len - BUF_SIZE_4KiB;
-	} else {
-		p->des01.etx.buffer1_size = len;
-	}
-	if (likely(csum_flag))
-		p->des01.etx.checksum_insertion = cic_full;
-}
-
-static void dwmac1000_clear_tx_ic(struct dma_desc *p)
-{
-	p->des01.etx.interrupt = 0;
-}
-
-static void dwmac1000_close_tx_desc(struct dma_desc *p)
-{
-	p->des01.etx.last_segment = 1;
-	p->des01.etx.interrupt = 1;
-}
-
-static int dwmac1000_get_rx_frame_len(struct dma_desc *p)
-{
-	return p->des01.erx.frame_length;
-}
-
 struct stmmac_dma_ops dwmac1000_dma_ops = {
 	.init = dwmac1000_dma_init,
 	.dump_regs = dwmac1000_dump_dma_regs,
@@ -454,21 +155,3 @@ struct stmmac_dma_ops dwmac1000_dma_ops = {
 	.stop_rx = dwmac_dma_stop_rx,
 	.dma_interrupt = dwmac_dma_interrupt,
 };
-
-struct stmmac_desc_ops dwmac1000_desc_ops = {
-	.tx_status = dwmac1000_get_tx_frame_status,
-	.rx_status = dwmac1000_get_rx_frame_status,
-	.get_tx_len = dwmac1000_get_tx_len,
-	.init_rx_desc = dwmac1000_init_rx_desc,
-	.init_tx_desc = dwmac1000_init_tx_desc,
-	.get_tx_owner = dwmac1000_get_tx_owner,
-	.get_rx_owner = dwmac1000_get_rx_owner,
-	.release_tx_desc = dwmac1000_release_tx_desc,
-	.prepare_tx_desc = dwmac1000_prepare_tx_desc,
-	.clear_tx_ic = dwmac1000_clear_tx_ic,
-	.close_tx_desc = dwmac1000_close_tx_desc,
-	.get_tx_ls = dwmac1000_get_tx_ls,
-	.set_tx_owner = dwmac1000_set_tx_owner,
-	.set_rx_owner = dwmac1000_set_rx_owner,
-	.get_rx_frame_len = dwmac1000_get_rx_frame_len,
-};
diff --git a/drivers/net/stmmac/dwmac100_core.c b/drivers/net/stmmac/dwmac100_core.c
index 8ecb8c0..fab14a4 100644
--- a/drivers/net/stmmac/dwmac100_core.c
+++ b/drivers/net/stmmac/dwmac100_core.c
@@ -141,7 +141,7 @@ static void dwmac100_set_filter(struct net_device *dev)
 
 	writel(value, ioaddr + MAC_CONTROL);
 
-	DBG(KERN_INFO "%s: CTRL reg: 0x%08x Hash regs: "
+	CHIP_DBG(KERN_INFO "%s: CTRL reg: 0x%08x Hash regs: "
 	    "HI 0x%08x, LO 0x%08x\n",
 	    __func__, readl(ioaddr + MAC_CONTROL),
 	    readl(ioaddr + MAC_HASH_HIGH), readl(ioaddr + MAC_HASH_LOW));
@@ -188,7 +188,6 @@ struct mac_device_info *dwmac100_setup(unsigned long ioaddr)
 	pr_info("\tDWMAC100\n");
 
 	mac->mac = &dwmac100_ops;
-	mac->desc = &dwmac100_desc_ops;
 	mac->dma = &dwmac100_dma_ops;
 
 	mac->pmt = PMT_NOT_SUPPORTED;
diff --git a/drivers/net/stmmac/dwmac100_dma.c b/drivers/net/stmmac/dwmac100_dma.c
index 7fcc526..96d098d 100644
--- a/drivers/net/stmmac/dwmac100_dma.c
+++ b/drivers/net/stmmac/dwmac100_dma.c
@@ -5,7 +5,7 @@
   DWC Ether MAC 10/100 Universal version 4.0 has been used for developing
   this code.
 
-  This contains the functions to handle the dma and descriptors.
+  This contains the functions to handle the dma.
 
   Copyright (C) 2007-2009  STMicroelectronics Ltd
 
@@ -79,14 +79,14 @@ static void dwmac100_dump_dma_regs(unsigned long ioaddr)
 {
 	int i;
 
-	DBG(KERN_DEBUG "DWMAC 100 DMA CSR\n");
+	CHIP_DBG(KERN_DEBUG "DWMAC 100 DMA CSR\n");
 	for (i = 0; i < 9; i++)
 		pr_debug("\t CSR%d (offset 0x%x): 0x%08x\n", i,
 		       (DMA_BUS_MODE + i * 4),
 		       readl(ioaddr + DMA_BUS_MODE + i * 4));
-	DBG(KERN_DEBUG "\t CSR20 (offset 0x%x): 0x%08x\n",
+	CHIP_DBG(KERN_DEBUG "\t CSR20 (offset 0x%x): 0x%08x\n",
 	    DMA_CUR_TX_BUF_ADDR, readl(ioaddr + DMA_CUR_TX_BUF_ADDR));
-	DBG(KERN_DEBUG "\t CSR21 (offset 0x%x): 0x%08x\n",
+	CHIP_DBG(KERN_DEBUG "\t CSR21 (offset 0x%x): 0x%08x\n",
 	    DMA_CUR_RX_BUF_ADDR, readl(ioaddr + DMA_CUR_RX_BUF_ADDR));
 	return;
 }
@@ -122,203 +122,6 @@ static void dwmac100_dma_diagnostic_fr(void *data, struct stmmac_extra_stats *x,
 	return;
 }
 
-static int dwmac100_get_tx_status(void *data, struct stmmac_extra_stats *x,
-				  struct dma_desc *p, unsigned long ioaddr)
-{
-	int ret = 0;
-	struct net_device_stats *stats = (struct net_device_stats *)data;
-
-	if (unlikely(p->des01.tx.error_summary)) {
-		if (unlikely(p->des01.tx.underflow_error)) {
-			x->tx_underflow++;
-			stats->tx_fifo_errors++;
-		}
-		if (unlikely(p->des01.tx.no_carrier)) {
-			x->tx_carrier++;
-			stats->tx_carrier_errors++;
-		}
-		if (unlikely(p->des01.tx.loss_carrier)) {
-			x->tx_losscarrier++;
-			stats->tx_carrier_errors++;
-		}
-		if (unlikely((p->des01.tx.excessive_deferral) ||
-			     (p->des01.tx.excessive_collisions) ||
-			     (p->des01.tx.late_collision)))
-			stats->collisions += p->des01.tx.collision_count;
-		ret = -1;
-	}
-	if (unlikely(p->des01.tx.heartbeat_fail)) {
-		x->tx_heartbeat++;
-		stats->tx_heartbeat_errors++;
-		ret = -1;
-	}
-	if (unlikely(p->des01.tx.deferred))
-		x->tx_deferred++;
-
-	return ret;
-}
-
-static int dwmac100_get_tx_len(struct dma_desc *p)
-{
-	return p->des01.tx.buffer1_size;
-}
-
-/* This function verifies if each incoming frame has some errors
- * and, if required, updates the multicast statistics.
- * In case of success, it returns csum_none becasue the device
- * is not able to compute the csum in HW. */
-static int dwmac100_get_rx_status(void *data, struct stmmac_extra_stats *x,
-				  struct dma_desc *p)
-{
-	int ret = csum_none;
-	struct net_device_stats *stats = (struct net_device_stats *)data;
-
-	if (unlikely(p->des01.rx.last_descriptor == 0)) {
-		pr_warning("dwmac100 Error: Oversized Ethernet "
-			   "frame spanned multiple buffers\n");
-		stats->rx_length_errors++;
-		return discard_frame;
-	}
-
-	if (unlikely(p->des01.rx.error_summary)) {
-		if (unlikely(p->des01.rx.descriptor_error))
-			x->rx_desc++;
-		if (unlikely(p->des01.rx.partial_frame_error))
-			x->rx_partial++;
-		if (unlikely(p->des01.rx.run_frame))
-			x->rx_runt++;
-		if (unlikely(p->des01.rx.frame_too_long))
-			x->rx_toolong++;
-		if (unlikely(p->des01.rx.collision)) {
-			x->rx_collision++;
-			stats->collisions++;
-		}
-		if (unlikely(p->des01.rx.crc_error)) {
-			x->rx_crc++;
-			stats->rx_crc_errors++;
-		}
-		ret = discard_frame;
-	}
-	if (unlikely(p->des01.rx.dribbling))
-		ret = discard_frame;
-
-	if (unlikely(p->des01.rx.length_error)) {
-		x->rx_length++;
-		ret = discard_frame;
-	}
-	if (unlikely(p->des01.rx.mii_error)) {
-		x->rx_mii++;
-		ret = discard_frame;
-	}
-	if (p->des01.rx.multicast_frame) {
-		x->rx_multicast++;
-		stats->multicast++;
-	}
-	return ret;
-}
-
-static void dwmac100_init_rx_desc(struct dma_desc *p, unsigned int ring_size,
-				  int disable_rx_ic)
-{
-	int i;
-	for (i = 0; i < ring_size; i++) {
-		p->des01.rx.own = 1;
-		p->des01.rx.buffer1_size = BUF_SIZE_2KiB - 1;
-		if (i == ring_size - 1)
-			p->des01.rx.end_ring = 1;
-		if (disable_rx_ic)
-			p->des01.rx.disable_ic = 1;
-		p++;
-	}
-	return;
-}
-
-static void dwmac100_init_tx_desc(struct dma_desc *p, unsigned int ring_size)
-{
-	int i;
-	for (i = 0; i < ring_size; i++) {
-		p->des01.tx.own = 0;
-		if (i == ring_size - 1)
-			p->des01.tx.end_ring = 1;
-		p++;
-	}
-	return;
-}
-
-static int dwmac100_get_tx_owner(struct dma_desc *p)
-{
-	return p->des01.tx.own;
-}
-
-static int dwmac100_get_rx_owner(struct dma_desc *p)
-{
-	return p->des01.rx.own;
-}
-
-static void dwmac100_set_tx_owner(struct dma_desc *p)
-{
-	p->des01.tx.own = 1;
-}
-
-static void dwmac100_set_rx_owner(struct dma_desc *p)
-{
-	p->des01.rx.own = 1;
-}
-
-static int dwmac100_get_tx_ls(struct dma_desc *p)
-{
-	return p->des01.tx.last_segment;
-}
-
-static void dwmac100_release_tx_desc(struct dma_desc *p)
-{
-	int ter = p->des01.tx.end_ring;
-
-	/* clean field used within the xmit */
-	p->des01.tx.first_segment = 0;
-	p->des01.tx.last_segment = 0;
-	p->des01.tx.buffer1_size = 0;
-
-	/* clean status reported */
-	p->des01.tx.error_summary = 0;
-	p->des01.tx.underflow_error = 0;
-	p->des01.tx.no_carrier = 0;
-	p->des01.tx.loss_carrier = 0;
-	p->des01.tx.excessive_deferral = 0;
-	p->des01.tx.excessive_collisions = 0;
-	p->des01.tx.late_collision = 0;
-	p->des01.tx.heartbeat_fail = 0;
-	p->des01.tx.deferred = 0;
-
-	/* set termination field */
-	p->des01.tx.end_ring = ter;
-
-	return;
-}
-
-static void dwmac100_prepare_tx_desc(struct dma_desc *p, int is_fs, int len,
-				     int csum_flag)
-{
-	p->des01.tx.first_segment = is_fs;
-	p->des01.tx.buffer1_size = len;
-}
-
-static void dwmac100_clear_tx_ic(struct dma_desc *p)
-{
-	p->des01.tx.interrupt = 0;
-}
-
-static void dwmac100_close_tx_desc(struct dma_desc *p)
-{
-	p->des01.tx.last_segment = 1;
-	p->des01.tx.interrupt = 1;
-}
-
-static int dwmac100_get_rx_frame_len(struct dma_desc *p)
-{
-	return p->des01.rx.frame_length;
-}
-
 struct stmmac_dma_ops dwmac100_dma_ops = {
 	.init = dwmac100_dma_init,
 	.dump_regs = dwmac100_dump_dma_regs,
@@ -333,21 +136,3 @@ struct stmmac_dma_ops dwmac100_dma_ops = {
 	.stop_rx = dwmac_dma_stop_rx,
 	.dma_interrupt = dwmac_dma_interrupt,
 };
-
-struct stmmac_desc_ops dwmac100_desc_ops = {
-	.tx_status = dwmac100_get_tx_status,
-	.rx_status = dwmac100_get_rx_status,
-	.get_tx_len = dwmac100_get_tx_len,
-	.init_rx_desc = dwmac100_init_rx_desc,
-	.init_tx_desc = dwmac100_init_tx_desc,
-	.get_tx_owner = dwmac100_get_tx_owner,
-	.get_rx_owner = dwmac100_get_rx_owner,
-	.release_tx_desc = dwmac100_release_tx_desc,
-	.prepare_tx_desc = dwmac100_prepare_tx_desc,
-	.clear_tx_ic = dwmac100_clear_tx_ic,
-	.close_tx_desc = dwmac100_close_tx_desc,
-	.get_tx_ls = dwmac100_get_tx_ls,
-	.set_tx_owner = dwmac100_set_tx_owner,
-	.set_rx_owner = dwmac100_set_rx_owner,
-	.get_rx_frame_len = dwmac100_get_rx_frame_len,
-};
diff --git a/drivers/net/stmmac/enh_desc.c b/drivers/net/stmmac/enh_desc.c
new file mode 100644
index 0000000..e5ac259
--- /dev/null
+++ b/drivers/net/stmmac/enh_desc.c
@@ -0,0 +1,342 @@
+/*******************************************************************************
+  This contains the functions to handle the enhanced descriptors.
+
+  Copyright (C) 2007-2009  STMicroelectronics Ltd
+
+  This program is free software; you can redistribute it and/or modify it
+  under the terms and conditions of the GNU General Public License,
+  version 2, as published by the Free Software Foundation.
+
+  This program is distributed in the hope it will be useful, but WITHOUT
+  ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+  FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+  more details.
+
+  You should have received a copy of the GNU General Public License along with
+  this program; if not, write to the Free Software Foundation, Inc.,
+  51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA.
+
+  The full GNU General Public License is included in this distribution in
+  the file called "COPYING".
+
+  Author: Giuseppe Cavallaro <peppe.cavallaro@st.com>
+*******************************************************************************/
+
+#include "common.h"
+
+static int enh_desc_get_tx_status(void *data, struct stmmac_extra_stats *x,
+				  struct dma_desc *p, unsigned long ioaddr)
+{
+	int ret = 0;
+	struct net_device_stats *stats = (struct net_device_stats *)data;
+
+	if (unlikely(p->des01.etx.error_summary)) {
+		CHIP_DBG(KERN_ERR "GMAC TX error... 0x%08x\n", p->des01.etx);
+		if (unlikely(p->des01.etx.jabber_timeout)) {
+			CHIP_DBG(KERN_ERR "\tjabber_timeout error\n");
+			x->tx_jabber++;
+		}
+
+		if (unlikely(p->des01.etx.frame_flushed)) {
+			CHIP_DBG(KERN_ERR "\tframe_flushed error\n");
+			x->tx_frame_flushed++;
+			/*enh_desc_flush_tx_fifo(ioaddr);*/
+		}
+
+		if (unlikely(p->des01.etx.loss_carrier)) {
+			CHIP_DBG(KERN_ERR "\tloss_carrier error\n");
+			x->tx_losscarrier++;
+			stats->tx_carrier_errors++;
+		}
+		if (unlikely(p->des01.etx.no_carrier)) {
+			CHIP_DBG(KERN_ERR "\tno_carrier error\n");
+			x->tx_carrier++;
+			stats->tx_carrier_errors++;
+		}
+		if (unlikely(p->des01.etx.late_collision)) {
+			CHIP_DBG(KERN_ERR "\tlate_collision error\n");
+			stats->collisions += p->des01.etx.collision_count;
+		}
+		if (unlikely(p->des01.etx.excessive_collisions)) {
+			CHIP_DBG(KERN_ERR "\texcessive_collisions\n");
+			stats->collisions += p->des01.etx.collision_count;
+		}
+		if (unlikely(p->des01.etx.excessive_deferral)) {
+			CHIP_DBG(KERN_INFO "\texcessive tx_deferral\n");
+			x->tx_deferred++;
+		}
+
+		if (unlikely(p->des01.etx.underflow_error)) {
+			CHIP_DBG(KERN_ERR "\tunderflow error\n");
+			/*enh_desc_flush_tx_fifo(ioaddr);*/
+			x->tx_underflow++;
+		}
+
+		if (unlikely(p->des01.etx.ip_header_error)) {
+			CHIP_DBG(KERN_ERR "\tTX IP header csum error\n");
+			x->tx_ip_header_error++;
+		}
+
+		if (unlikely(p->des01.etx.payload_error)) {
+			CHIP_DBG(KERN_ERR "\tAddr/Payload csum error\n");
+			x->tx_payload_error++;
+			/*enh_desc_flush_tx_fifo(ioaddr);*/
+		}
+
+		ret = -1;
+	}
+
+	if (unlikely(p->des01.etx.deferred)) {
+		CHIP_DBG(KERN_INFO "GMAC TX status: tx deferred\n");
+		x->tx_deferred++;
+	}
+#ifdef STMMAC_VLAN_TAG_USED
+	if (p->des01.etx.vlan_frame) {
+		CHIP_DBG(KERN_INFO "GMAC TX status: VLAN frame\n");
+		x->tx_vlan++;
+	}
+#endif
+
+	return ret;
+}
+
+static int enh_desc_get_tx_len(struct dma_desc *p)
+{
+	return p->des01.etx.buffer1_size;
+}
+
+static int enh_desc_coe_rdes0(int ipc_err, int type, int payload_err)
+{
+	int ret = good_frame;
+	u32 status = (type << 2 | ipc_err << 1 | payload_err) & 0x7;
+
+	/* bits 5 7 0 | Frame status
+	 * ----------------------------------------------------------
+	 *      0 0 0 | IEEE 802.3 Type frame (length < 1536 octets)
+	 *      1 0 0 | IPv4/6 No CSUM errorS.
+	 *      1 0 1 | IPv4/6 CSUM PAYLOAD error
+	 *      1 1 0 | IPv4/6 CSUM IP HR error
+	 *      1 1 1 | IPv4/6 IP PAYLOAD AND HEADER errorS
+	 *      0 0 1 | IPv4/6 unsupported IP PAYLOAD
+	 *      0 1 1 | COE bypassed.. no IPv4/6 frame
+	 *      0 1 0 | Reserved.
+	 */
+	if (status == 0x0) {
+		CHIP_DBG(KERN_INFO "RX Des0 status: IEEE 802.3 Type frame.\n");
+		ret = good_frame;
+	} else if (status == 0x4) {
+		CHIP_DBG(KERN_INFO "RX Des0 status: IPv4/6 No CSUM errorS.\n");
+		ret = good_frame;
+	} else if (status == 0x5) {
+		CHIP_DBG(KERN_ERR "RX Des0 status: IPv4/6 Payload Error.\n");
+		ret = csum_none;
+	} else if (status == 0x6) {
+		CHIP_DBG(KERN_ERR "RX Des0 status: IPv4/6 Header Error.\n");
+		ret = csum_none;
+	} else if (status == 0x7) {
+		CHIP_DBG(KERN_ERR
+		    "RX Des0 status: IPv4/6 Header and Payload Error.\n");
+		ret = csum_none;
+	} else if (status == 0x1) {
+		CHIP_DBG(KERN_ERR
+		    "RX Des0 status: IPv4/6 unsupported IP PAYLOAD.\n");
+		ret = discard_frame;
+	} else if (status == 0x3) {
+		CHIP_DBG(KERN_ERR "RX Des0 status: No IPv4, IPv6 frame.\n");
+		ret = discard_frame;
+	}
+	return ret;
+}
+
+static int enh_desc_get_rx_status(void *data, struct stmmac_extra_stats *x,
+				  struct dma_desc *p)
+{
+	int ret = good_frame;
+	struct net_device_stats *stats = (struct net_device_stats *)data;
+
+	if (unlikely(p->des01.erx.error_summary)) {
+		CHIP_DBG(KERN_ERR "GMAC RX Error Summary 0x%08x\n",
+				  p->des01.erx);
+		if (unlikely(p->des01.erx.descriptor_error)) {
+			CHIP_DBG(KERN_ERR "\tdescriptor error\n");
+			x->rx_desc++;
+			stats->rx_length_errors++;
+		}
+		if (unlikely(p->des01.erx.overflow_error)) {
+			CHIP_DBG(KERN_ERR "\toverflow error\n");
+			x->rx_gmac_overflow++;
+		}
+
+		if (unlikely(p->des01.erx.ipc_csum_error))
+			CHIP_DBG(KERN_ERR "\tIPC Csum Error/Giant frame\n");
+
+		if (unlikely(p->des01.erx.late_collision)) {
+			CHIP_DBG(KERN_ERR "\tlate_collision error\n");
+			stats->collisions++;
+			stats->collisions++;
+		}
+		if (unlikely(p->des01.erx.receive_watchdog)) {
+			CHIP_DBG(KERN_ERR "\treceive_watchdog error\n");
+			x->rx_watchdog++;
+		}
+		if (unlikely(p->des01.erx.error_gmii)) {
+			CHIP_DBG(KERN_ERR "\tReceive Error\n");
+			x->rx_mii++;
+		}
+		if (unlikely(p->des01.erx.crc_error)) {
+			CHIP_DBG(KERN_ERR "\tCRC error\n");
+			x->rx_crc++;
+			stats->rx_crc_errors++;
+		}
+		ret = discard_frame;
+	}
+
+	/* After a payload csum error, the ES bit is set.
+	 * This does not match the information reported in the databook.
+	 * In any case, we need to know whether the HW csum computation
+	 * is ok and report this info to the upper layers. */
+	ret = enh_desc_coe_rdes0(p->des01.erx.ipc_csum_error,
+		p->des01.erx.frame_type, p->des01.erx.payload_csum_error);
+
+	if (unlikely(p->des01.erx.dribbling)) {
+		CHIP_DBG(KERN_ERR "GMAC RX: dribbling error\n");
+		ret = discard_frame;
+	}
+	if (unlikely(p->des01.erx.sa_filter_fail)) {
+		CHIP_DBG(KERN_ERR "GMAC RX : Source Address filter fail\n");
+		x->sa_rx_filter_fail++;
+		ret = discard_frame;
+	}
+	if (unlikely(p->des01.erx.da_filter_fail)) {
+		CHIP_DBG(KERN_ERR "GMAC RX : Dest Address filter fail\n");
+		x->da_rx_filter_fail++;
+		ret = discard_frame;
+	}
+	if (unlikely(p->des01.erx.length_error)) {
+		CHIP_DBG(KERN_ERR "GMAC RX: length_error error\n");
+		x->rx_length++;
+		ret = discard_frame;
+	}
+#ifdef STMMAC_VLAN_TAG_USED
+	if (p->des01.erx.vlan_tag) {
+		CHIP_DBG(KERN_INFO "GMAC RX: VLAN frame tagged\n");
+		x->rx_vlan++;
+	}
+#endif
+	return ret;
+}
+
+static void enh_desc_init_rx_desc(struct dma_desc *p, unsigned int ring_size,
+				  int disable_rx_ic)
+{
+	int i;
+	for (i = 0; i < ring_size; i++) {
+		p->des01.erx.own = 1;
+		p->des01.erx.buffer1_size = BUF_SIZE_8KiB - 1;
+		/* To support jumbo frames */
+		p->des01.erx.buffer2_size = BUF_SIZE_8KiB - 1;
+		if (i == ring_size - 1)
+			p->des01.erx.end_ring = 1;
+		if (disable_rx_ic)
+			p->des01.erx.disable_ic = 1;
+		p++;
+	}
+	return;
+}
+
+static void enh_desc_init_tx_desc(struct dma_desc *p, unsigned int ring_size)
+{
+	int i;
+
+	for (i = 0; i < ring_size; i++) {
+		p->des01.etx.own = 0;
+		if (i == ring_size - 1)
+			p->des01.etx.end_ring = 1;
+		p++;
+	}
+
+	return;
+}
+
+static int enh_desc_get_tx_owner(struct dma_desc *p)
+{
+	return p->des01.etx.own;
+}
+
+static int enh_desc_get_rx_owner(struct dma_desc *p)
+{
+	return p->des01.erx.own;
+}
+
+static void enh_desc_set_tx_owner(struct dma_desc *p)
+{
+	p->des01.etx.own = 1;
+}
+
+static void enh_desc_set_rx_owner(struct dma_desc *p)
+{
+	p->des01.erx.own = 1;
+}
+
+static int enh_desc_get_tx_ls(struct dma_desc *p)
+{
+	return p->des01.etx.last_segment;
+}
+
+static void enh_desc_release_tx_desc(struct dma_desc *p)
+{
+	int ter = p->des01.etx.end_ring;
+
+	memset(p, 0, sizeof(struct dma_desc));
+	p->des01.etx.end_ring = ter;
+
+	return;
+}
+
+static void enh_desc_prepare_tx_desc(struct dma_desc *p, int is_fs, int len,
+				     int csum_flag)
+{
+	p->des01.etx.first_segment = is_fs;
+	if (unlikely(len > BUF_SIZE_4KiB)) {
+		p->des01.etx.buffer1_size = BUF_SIZE_4KiB;
+		p->des01.etx.buffer2_size = len - BUF_SIZE_4KiB;
+	} else {
+		p->des01.etx.buffer1_size = len;
+	}
+	if (likely(csum_flag))
+		p->des01.etx.checksum_insertion = cic_full;
+}
+
+static void enh_desc_clear_tx_ic(struct dma_desc *p)
+{
+	p->des01.etx.interrupt = 0;
+}
+
+static void enh_desc_close_tx_desc(struct dma_desc *p)
+{
+	p->des01.etx.last_segment = 1;
+	p->des01.etx.interrupt = 1;
+}
+
+static int enh_desc_get_rx_frame_len(struct dma_desc *p)
+{
+	return p->des01.erx.frame_length;
+}
+
+struct stmmac_desc_ops enh_desc_ops = {
+	.tx_status = enh_desc_get_tx_status,
+	.rx_status = enh_desc_get_rx_status,
+	.get_tx_len = enh_desc_get_tx_len,
+	.init_rx_desc = enh_desc_init_rx_desc,
+	.init_tx_desc = enh_desc_init_tx_desc,
+	.get_tx_owner = enh_desc_get_tx_owner,
+	.get_rx_owner = enh_desc_get_rx_owner,
+	.release_tx_desc = enh_desc_release_tx_desc,
+	.prepare_tx_desc = enh_desc_prepare_tx_desc,
+	.clear_tx_ic = enh_desc_clear_tx_ic,
+	.close_tx_desc = enh_desc_close_tx_desc,
+	.get_tx_ls = enh_desc_get_tx_ls,
+	.set_tx_owner = enh_desc_set_tx_owner,
+	.set_rx_owner = enh_desc_set_rx_owner,
+	.get_rx_frame_len = enh_desc_get_rx_frame_len,
+};
diff --git a/drivers/net/stmmac/norm_desc.c b/drivers/net/stmmac/norm_desc.c
new file mode 100644
index 0000000..ecfcc00
--- /dev/null
+++ b/drivers/net/stmmac/norm_desc.c
@@ -0,0 +1,240 @@
+/*******************************************************************************
+  This contains the functions to handle the normal descriptors.
+
+  Copyright (C) 2007-2009  STMicroelectronics Ltd
+
+  This program is free software; you can redistribute it and/or modify it
+  under the terms and conditions of the GNU General Public License,
+  version 2, as published by the Free Software Foundation.
+
+  This program is distributed in the hope it will be useful, but WITHOUT
+  ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+  FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+  more details.
+
+  You should have received a copy of the GNU General Public License along with
+  this program; if not, write to the Free Software Foundation, Inc.,
+  51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA.
+
+  The full GNU General Public License is included in this distribution in
+  the file called "COPYING".
+
+  Author: Giuseppe Cavallaro <peppe.cavallaro@st.com>
+*******************************************************************************/
+
+#include "common.h"
+
+static int ndesc_get_tx_status(void *data, struct stmmac_extra_stats *x,
+			       struct dma_desc *p, unsigned long ioaddr)
+{
+	int ret = 0;
+	struct net_device_stats *stats = (struct net_device_stats *)data;
+
+	if (unlikely(p->des01.tx.error_summary)) {
+		if (unlikely(p->des01.tx.underflow_error)) {
+			x->tx_underflow++;
+			stats->tx_fifo_errors++;
+		}
+		if (unlikely(p->des01.tx.no_carrier)) {
+			x->tx_carrier++;
+			stats->tx_carrier_errors++;
+		}
+		if (unlikely(p->des01.tx.loss_carrier)) {
+			x->tx_losscarrier++;
+			stats->tx_carrier_errors++;
+		}
+		if (unlikely((p->des01.tx.excessive_deferral) ||
+			     (p->des01.tx.excessive_collisions) ||
+			     (p->des01.tx.late_collision)))
+			stats->collisions += p->des01.tx.collision_count;
+		ret = -1;
+	}
+	if (unlikely(p->des01.tx.heartbeat_fail)) {
+		x->tx_heartbeat++;
+		stats->tx_heartbeat_errors++;
+		ret = -1;
+	}
+	if (unlikely(p->des01.tx.deferred))
+		x->tx_deferred++;
+
+	return ret;
+}
+
+static int ndesc_get_tx_len(struct dma_desc *p)
+{
+	return p->des01.tx.buffer1_size;
+}
+
+/* This function checks each incoming frame for errors
+ * and, if required, updates the multicast statistics.
+ * On success it returns csum_none because the device
+ * is not able to compute the csum in HW. */
+static int ndesc_get_rx_status(void *data, struct stmmac_extra_stats *x,
+			       struct dma_desc *p)
+{
+	int ret = csum_none;
+	struct net_device_stats *stats = (struct net_device_stats *)data;
+
+	if (unlikely(p->des01.rx.last_descriptor == 0)) {
+		pr_warning("ndesc Error: Oversized Ethernet "
+			   "frame spanned multiple buffers\n");
+		stats->rx_length_errors++;
+		return discard_frame;
+	}
+
+	if (unlikely(p->des01.rx.error_summary)) {
+		if (unlikely(p->des01.rx.descriptor_error))
+			x->rx_desc++;
+		if (unlikely(p->des01.rx.partial_frame_error))
+			x->rx_partial++;
+		if (unlikely(p->des01.rx.run_frame))
+			x->rx_runt++;
+		if (unlikely(p->des01.rx.frame_too_long))
+			x->rx_toolong++;
+		if (unlikely(p->des01.rx.collision)) {
+			x->rx_collision++;
+			stats->collisions++;
+		}
+		if (unlikely(p->des01.rx.crc_error)) {
+			x->rx_crc++;
+			stats->rx_crc_errors++;
+		}
+		ret = discard_frame;
+	}
+	if (unlikely(p->des01.rx.dribbling))
+		ret = discard_frame;
+
+	if (unlikely(p->des01.rx.length_error)) {
+		x->rx_length++;
+		ret = discard_frame;
+	}
+	if (unlikely(p->des01.rx.mii_error)) {
+		x->rx_mii++;
+		ret = discard_frame;
+	}
+	if (p->des01.rx.multicast_frame) {
+		x->rx_multicast++;
+		stats->multicast++;
+	}
+	return ret;
+}
+
+static void ndesc_init_rx_desc(struct dma_desc *p, unsigned int ring_size,
+			       int disable_rx_ic)
+{
+	int i;
+	for (i = 0; i < ring_size; i++) {
+		p->des01.rx.own = 1;
+		p->des01.rx.buffer1_size = BUF_SIZE_2KiB - 1;
+		if (i == ring_size - 1)
+			p->des01.rx.end_ring = 1;
+		if (disable_rx_ic)
+			p->des01.rx.disable_ic = 1;
+		p++;
+	}
+	return;
+}
+
+static void ndesc_init_tx_desc(struct dma_desc *p, unsigned int ring_size)
+{
+	int i;
+	for (i = 0; i < ring_size; i++) {
+		p->des01.tx.own = 0;
+		if (i == ring_size - 1)
+			p->des01.tx.end_ring = 1;
+		p++;
+	}
+	return;
+}
+
+static int ndesc_get_tx_owner(struct dma_desc *p)
+{
+	return p->des01.tx.own;
+}
+
+static int ndesc_get_rx_owner(struct dma_desc *p)
+{
+	return p->des01.rx.own;
+}
+
+static void ndesc_set_tx_owner(struct dma_desc *p)
+{
+	p->des01.tx.own = 1;
+}
+
+static void ndesc_set_rx_owner(struct dma_desc *p)
+{
+	p->des01.rx.own = 1;
+}
+
+static int ndesc_get_tx_ls(struct dma_desc *p)
+{
+	return p->des01.tx.last_segment;
+}
+
+static void ndesc_release_tx_desc(struct dma_desc *p)
+{
+	int ter = p->des01.tx.end_ring;
+
+	/* clean field used within the xmit */
+	p->des01.tx.first_segment = 0;
+	p->des01.tx.last_segment = 0;
+	p->des01.tx.buffer1_size = 0;
+
+	/* clean status reported */
+	p->des01.tx.error_summary = 0;
+	p->des01.tx.underflow_error = 0;
+	p->des01.tx.no_carrier = 0;
+	p->des01.tx.loss_carrier = 0;
+	p->des01.tx.excessive_deferral = 0;
+	p->des01.tx.excessive_collisions = 0;
+	p->des01.tx.late_collision = 0;
+	p->des01.tx.heartbeat_fail = 0;
+	p->des01.tx.deferred = 0;
+
+	/* set termination field */
+	p->des01.tx.end_ring = ter;
+
+	return;
+}
+
+static void ndesc_prepare_tx_desc(struct dma_desc *p, int is_fs, int len,
+				  int csum_flag)
+{
+	p->des01.tx.first_segment = is_fs;
+	p->des01.tx.buffer1_size = len;
+}
+
+static void ndesc_clear_tx_ic(struct dma_desc *p)
+{
+	p->des01.tx.interrupt = 0;
+}
+
+static void ndesc_close_tx_desc(struct dma_desc *p)
+{
+	p->des01.tx.last_segment = 1;
+	p->des01.tx.interrupt = 1;
+}
+
+static int ndesc_get_rx_frame_len(struct dma_desc *p)
+{
+	return p->des01.rx.frame_length;
+}
+
+struct stmmac_desc_ops ndesc_ops = {
+	.tx_status = ndesc_get_tx_status,
+	.rx_status = ndesc_get_rx_status,
+	.get_tx_len = ndesc_get_tx_len,
+	.init_rx_desc = ndesc_init_rx_desc,
+	.init_tx_desc = ndesc_init_tx_desc,
+	.get_tx_owner = ndesc_get_tx_owner,
+	.get_rx_owner = ndesc_get_rx_owner,
+	.release_tx_desc = ndesc_release_tx_desc,
+	.prepare_tx_desc = ndesc_prepare_tx_desc,
+	.clear_tx_ic = ndesc_clear_tx_ic,
+	.close_tx_desc = ndesc_close_tx_desc,
+	.get_tx_ls = ndesc_get_tx_ls,
+	.set_tx_owner = ndesc_set_tx_owner,
+	.set_rx_owner = ndesc_set_rx_owner,
+	.get_rx_frame_len = ndesc_get_rx_frame_len,
+};
diff --git a/drivers/net/stmmac/stmmac.h b/drivers/net/stmmac/stmmac.h
index ba35e69..55b9aca 100644
--- a/drivers/net/stmmac/stmmac.h
+++ b/drivers/net/stmmac/stmmac.h
@@ -120,3 +120,5 @@ static inline int stmmac_claim_resource(struct platform_device *pdev)
 extern int stmmac_mdio_unregister(struct net_device *ndev);
 extern int stmmac_mdio_register(struct net_device *ndev);
 extern void stmmac_set_ethtool_ops(struct net_device *netdev);
+extern struct stmmac_desc_ops enh_desc_ops;
+extern struct stmmac_desc_ops ndesc_ops;
diff --git a/drivers/net/stmmac/stmmac_main.c b/drivers/net/stmmac/stmmac_main.c
index cc532ef..dafe4dc 100644
--- a/drivers/net/stmmac/stmmac_main.c
+++ b/drivers/net/stmmac/stmmac_main.c
@@ -1582,10 +1582,13 @@ static int stmmac_mac_device_setup(struct net_device *dev)
 
 	struct mac_device_info *device;
 
-	if (priv->is_gmac)
+	if (priv->is_gmac) {
 		device = dwmac1000_setup(ioaddr);
-	else
+		device->desc = &enh_desc_ops;
+	} else {
 		device = dwmac100_setup(ioaddr);
+		device->desc = &ndesc_ops;
+	}
 
 	if (!device)
 		return -ENOMEM;
-- 
1.6.0.4



* [net-next 1/7] stmmac: split core and dma for the mac10/100
From: Giuseppe CAVALLARO @ 2010-04-14  6:21 UTC
  To: netdev; +Cc: Giuseppe Cavallaro

This patch splits the core and DMA parts of the MAC 10/100 device,
as was already done for the GMAC device.
The split should make the driver more flexible and easier to extend
to other chips.

Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
---
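For readers new to the layout, here is a minimal userspace sketch of the
idea behind the split: the core, DMA and descriptor callbacks sit in
separate tables, and a setup routine wires them together. Every name in it
(mac_info, mac100_setup, mac_bring_up, the 11-bit length mask, the burst
limit) is a hypothetical simplification for illustration and is not the
driver code in this patch.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified ops tables; the real driver's are larger. */
struct core_ops {
	const char *(*name)(void);
};

struct dma_ops {
	int (*init)(unsigned int pbl);
};

struct desc_ops {
	unsigned int (*get_tx_len)(unsigned int des1);
};

struct mac_info {
	const struct core_ops *mac;
	const struct dma_ops *dma;
	const struct desc_ops *desc;
};

/* --- would live in a "core" source file --- */
static const char *mac100_name(void)
{
	return "mac100";
}

static const struct core_ops mac100_core_ops = {
	.name = mac100_name,
};

/* --- would live in a "dma" source file --- */
static int mac100_dma_init(unsigned int pbl)
{
	/* Illustrative limit only: reject burst lengths above 32. */
	return pbl <= 32 ? 0 : -1;
}

static unsigned int mac100_get_tx_len(unsigned int des1)
{
	/* Illustrative 11-bit length field, not the real descriptor layout. */
	return des1 & 0x7ff;
}

static const struct dma_ops mac100_dma_ops = {
	.init = mac100_dma_init,
};

static const struct desc_ops mac100_desc_ops = {
	.get_tx_len = mac100_get_tx_len,
};

/* Wire the tables together; return NULL if the allocation fails. */
struct mac_info *mac100_setup(void)
{
	struct mac_info *mac = calloc(1, sizeof(*mac));

	if (!mac)
		return NULL;

	mac->mac = &mac100_core_ops;
	mac->dma = &mac100_dma_ops;
	mac->desc = &mac100_desc_ops;
	return mac;
}

/* Generic caller: it only talks to the tables, never to a chip directly. */
int mac_bring_up(const struct mac_info *mac, unsigned int pbl)
{
	assert(mac && mac->dma && mac->dma->init);
	return mac->dma->init(pbl);
}
```

With this shape, supporting another chip would mostly mean providing new
tables and a new setup routine, while callers such as mac_bring_up() stay
unchanged. The sketch checks the allocation result before using it, which
is a choice made for the example.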
 drivers/net/stmmac/Makefile        |    2 +-
 drivers/net/stmmac/dwmac100.c      |  538 ------------------------------------
 drivers/net/stmmac/dwmac100.h      |   17 ++
 drivers/net/stmmac/dwmac100_core.c |  202 ++++++++++++++
 drivers/net/stmmac/dwmac100_dma.c  |  353 +++++++++++++++++++++++
 5 files changed, 573 insertions(+), 539 deletions(-)
 delete mode 100644 drivers/net/stmmac/dwmac100.c
 create mode 100644 drivers/net/stmmac/dwmac100_core.c
 create mode 100644 drivers/net/stmmac/dwmac100_dma.c

diff --git a/drivers/net/stmmac/Makefile b/drivers/net/stmmac/Makefile
index c776af1..b14bd56 100644
--- a/drivers/net/stmmac/Makefile
+++ b/drivers/net/stmmac/Makefile
@@ -2,4 +2,4 @@ obj-$(CONFIG_STMMAC_ETH) += stmmac.o
 stmmac-$(CONFIG_STMMAC_TIMER) += stmmac_timer.o
 stmmac-objs:= stmmac_main.o stmmac_ethtool.o stmmac_mdio.o	\
 	      dwmac_lib.o dwmac1000_core.o  dwmac1000_dma.o	\
-	      dwmac100.o $(stmmac-y)
+	      dwmac100_core.o dwmac100_dma.o $(stmmac-y)
diff --git a/drivers/net/stmmac/dwmac100.c b/drivers/net/stmmac/dwmac100.c
deleted file mode 100644
index 1ca84ea..0000000
--- a/drivers/net/stmmac/dwmac100.c
+++ /dev/null
@@ -1,538 +0,0 @@
-/*******************************************************************************
-  This is the driver for the MAC 10/100 on-chip Ethernet controller
-  currently tested on all the ST boards based on STb7109 and stx7200 SoCs.
-
-  DWC Ether MAC 10/100 Universal version 4.0 has been used for developing
-  this code.
-
-  Copyright (C) 2007-2009  STMicroelectronics Ltd
-
-  This program is free software; you can redistribute it and/or modify it
-  under the terms and conditions of the GNU General Public License,
-  version 2, as published by the Free Software Foundation.
-
-  This program is distributed in the hope it will be useful, but WITHOUT
-  ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
-  FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
-  more details.
-
-  You should have received a copy of the GNU General Public License along with
-  this program; if not, write to the Free Software Foundation, Inc.,
-  51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA.
-
-  The full GNU General Public License is included in this distribution in
-  the file called "COPYING".
-
-  Author: Giuseppe Cavallaro <peppe.cavallaro@st.com>
-*******************************************************************************/
-
-#include <linux/crc32.h>
-#include <linux/mii.h>
-#include <linux/phy.h>
-#include <linux/slab.h>
-
-#include "common.h"
-#include "dwmac100.h"
-#include "dwmac_dma.h"
-
-#undef DWMAC100_DEBUG
-/*#define DWMAC100_DEBUG*/
-#ifdef DWMAC100_DEBUG
-#define DBG(fmt, args...)  printk(fmt, ## args)
-#else
-#define DBG(fmt, args...)  do { } while (0)
-#endif
-
-static void dwmac100_core_init(unsigned long ioaddr)
-{
-	u32 value = readl(ioaddr + MAC_CONTROL);
-
-	writel((value | MAC_CORE_INIT), ioaddr + MAC_CONTROL);
-
-#ifdef STMMAC_VLAN_TAG_USED
-	writel(ETH_P_8021Q, ioaddr + MAC_VLAN1);
-#endif
-	return;
-}
-
-static void dwmac100_dump_mac_regs(unsigned long ioaddr)
-{
-	pr_info("\t----------------------------------------------\n"
-		"\t  DWMAC 100 CSR (base addr = 0x%8x)\n"
-		"\t----------------------------------------------\n",
-		(unsigned int)ioaddr);
-	pr_info("\tcontrol reg (offset 0x%x): 0x%08x\n", MAC_CONTROL,
-		readl(ioaddr + MAC_CONTROL));
-	pr_info("\taddr HI (offset 0x%x): 0x%08x\n ", MAC_ADDR_HIGH,
-		readl(ioaddr + MAC_ADDR_HIGH));
-	pr_info("\taddr LO (offset 0x%x): 0x%08x\n", MAC_ADDR_LOW,
-		readl(ioaddr + MAC_ADDR_LOW));
-	pr_info("\tmulticast hash HI (offset 0x%x): 0x%08x\n",
-		MAC_HASH_HIGH, readl(ioaddr + MAC_HASH_HIGH));
-	pr_info("\tmulticast hash LO (offset 0x%x): 0x%08x\n",
-		MAC_HASH_LOW, readl(ioaddr + MAC_HASH_LOW));
-	pr_info("\tflow control (offset 0x%x): 0x%08x\n",
-		MAC_FLOW_CTRL, readl(ioaddr + MAC_FLOW_CTRL));
-	pr_info("\tVLAN1 tag (offset 0x%x): 0x%08x\n", MAC_VLAN1,
-		readl(ioaddr + MAC_VLAN1));
-	pr_info("\tVLAN2 tag (offset 0x%x): 0x%08x\n", MAC_VLAN2,
-		readl(ioaddr + MAC_VLAN2));
-	pr_info("\n\tMAC management counter registers\n");
-	pr_info("\t MMC crtl (offset 0x%x): 0x%08x\n",
-		MMC_CONTROL, readl(ioaddr + MMC_CONTROL));
-	pr_info("\t MMC High Interrupt (offset 0x%x): 0x%08x\n",
-		MMC_HIGH_INTR, readl(ioaddr + MMC_HIGH_INTR));
-	pr_info("\t MMC Low Interrupt (offset 0x%x): 0x%08x\n",
-		MMC_LOW_INTR, readl(ioaddr + MMC_LOW_INTR));
-	pr_info("\t MMC High Interrupt Mask (offset 0x%x): 0x%08x\n",
-		MMC_HIGH_INTR_MASK, readl(ioaddr + MMC_HIGH_INTR_MASK));
-	pr_info("\t MMC Low Interrupt Mask (offset 0x%x): 0x%08x\n",
-		MMC_LOW_INTR_MASK, readl(ioaddr + MMC_LOW_INTR_MASK));
-	return;
-}
-
-static int dwmac100_dma_init(unsigned long ioaddr, int pbl, u32 dma_tx,
-			   u32 dma_rx)
-{
-	u32 value = readl(ioaddr + DMA_BUS_MODE);
-	/* DMA SW reset */
-	value |= DMA_BUS_MODE_SFT_RESET;
-	writel(value, ioaddr + DMA_BUS_MODE);
-	do {} while ((readl(ioaddr + DMA_BUS_MODE) & DMA_BUS_MODE_SFT_RESET));
-
-	/* Enable Application Access by writing to DMA CSR0 */
-	writel(DMA_BUS_MODE_DEFAULT | (pbl << DMA_BUS_MODE_PBL_SHIFT),
-	       ioaddr + DMA_BUS_MODE);
-
-	/* Mask interrupts by writing to CSR7 */
-	writel(DMA_INTR_DEFAULT_MASK, ioaddr + DMA_INTR_ENA);
-
-	/* The base address of the RX/TX descriptor lists must be written into
-	 * DMA CSR3 and CSR4, respectively. */
-	writel(dma_tx, ioaddr + DMA_TX_BASE_ADDR);
-	writel(dma_rx, ioaddr + DMA_RCV_BASE_ADDR);
-
-	return 0;
-}
-
-/* Store and Forward capability is not used at all..
- * The transmit threshold can be programmed by
- * setting the TTC bits in the DMA control register.*/
-static void dwmac100_dma_operation_mode(unsigned long ioaddr, int txmode,
-				      int rxmode)
-{
-	u32 csr6 = readl(ioaddr + DMA_CONTROL);
-
-	if (txmode <= 32)
-		csr6 |= DMA_CONTROL_TTC_32;
-	else if (txmode <= 64)
-		csr6 |= DMA_CONTROL_TTC_64;
-	else
-		csr6 |= DMA_CONTROL_TTC_128;
-
-	writel(csr6, ioaddr + DMA_CONTROL);
-
-	return;
-}
-
-static void dwmac100_dump_dma_regs(unsigned long ioaddr)
-{
-	int i;
-
-	DBG(KERN_DEBUG "DWMAC 100 DMA CSR\n");
-	for (i = 0; i < 9; i++)
-		pr_debug("\t CSR%d (offset 0x%x): 0x%08x\n", i,
-		       (DMA_BUS_MODE + i * 4),
-		       readl(ioaddr + DMA_BUS_MODE + i * 4));
-	DBG(KERN_DEBUG "\t CSR20 (offset 0x%x): 0x%08x\n",
-	    DMA_CUR_TX_BUF_ADDR, readl(ioaddr + DMA_CUR_TX_BUF_ADDR));
-	DBG(KERN_DEBUG "\t CSR21 (offset 0x%x): 0x%08x\n",
-	    DMA_CUR_RX_BUF_ADDR, readl(ioaddr + DMA_CUR_RX_BUF_ADDR));
-	return;
-}
-
-/* DMA controller has two counters to track the number of
- * the receive missed frames. */
-static void dwmac100_dma_diagnostic_fr(void *data,
-				     struct stmmac_extra_stats *x,
-				     unsigned long ioaddr)
-{
-	struct net_device_stats *stats = (struct net_device_stats *)data;
-	u32 csr8 = readl(ioaddr + DMA_MISSED_FRAME_CTR);
-
-	if (unlikely(csr8)) {
-		if (csr8 & DMA_MISSED_FRAME_OVE) {
-			stats->rx_over_errors += 0x800;
-			x->rx_overflow_cntr += 0x800;
-		} else {
-			unsigned int ove_cntr;
-			ove_cntr = ((csr8 & DMA_MISSED_FRAME_OVE_CNTR) >> 17);
-			stats->rx_over_errors += ove_cntr;
-			x->rx_overflow_cntr += ove_cntr;
-		}
-
-		if (csr8 & DMA_MISSED_FRAME_OVE_M) {
-			stats->rx_missed_errors += 0xffff;
-			x->rx_missed_cntr += 0xffff;
-		} else {
-			unsigned int miss_f = (csr8 & DMA_MISSED_FRAME_M_CNTR);
-			stats->rx_missed_errors += miss_f;
-			x->rx_missed_cntr += miss_f;
-		}
-	}
-	return;
-}
-
-static int dwmac100_get_tx_frame_status(void *data,
-				      struct stmmac_extra_stats *x,
-				      struct dma_desc *p, unsigned long ioaddr)
-{
-	int ret = 0;
-	struct net_device_stats *stats = (struct net_device_stats *)data;
-
-	if (unlikely(p->des01.tx.error_summary)) {
-		if (unlikely(p->des01.tx.underflow_error)) {
-			x->tx_underflow++;
-			stats->tx_fifo_errors++;
-		}
-		if (unlikely(p->des01.tx.no_carrier)) {
-			x->tx_carrier++;
-			stats->tx_carrier_errors++;
-		}
-		if (unlikely(p->des01.tx.loss_carrier)) {
-			x->tx_losscarrier++;
-			stats->tx_carrier_errors++;
-		}
-		if (unlikely((p->des01.tx.excessive_deferral) ||
-			     (p->des01.tx.excessive_collisions) ||
-			     (p->des01.tx.late_collision)))
-			stats->collisions += p->des01.tx.collision_count;
-		ret = -1;
-	}
-	if (unlikely(p->des01.tx.heartbeat_fail)) {
-		x->tx_heartbeat++;
-		stats->tx_heartbeat_errors++;
-		ret = -1;
-	}
-	if (unlikely(p->des01.tx.deferred))
-		x->tx_deferred++;
-
-	return ret;
-}
-
-static int dwmac100_get_tx_len(struct dma_desc *p)
-{
-	return p->des01.tx.buffer1_size;
-}
-
-/* This function verifies if each incoming frame has some errors
- * and, if required, updates the multicast statistics.
- * In case of success, it returns csum_none becasue the device
- * is not able to compute the csum in HW. */
-static int dwmac100_get_rx_frame_status(void *data,
-				      struct stmmac_extra_stats *x,
-				      struct dma_desc *p)
-{
-	int ret = csum_none;
-	struct net_device_stats *stats = (struct net_device_stats *)data;
-
-	if (unlikely(p->des01.rx.last_descriptor == 0)) {
-		pr_warning("dwmac100 Error: Oversized Ethernet "
-			   "frame spanned multiple buffers\n");
-		stats->rx_length_errors++;
-		return discard_frame;
-	}
-
-	if (unlikely(p->des01.rx.error_summary)) {
-		if (unlikely(p->des01.rx.descriptor_error))
-			x->rx_desc++;
-		if (unlikely(p->des01.rx.partial_frame_error))
-			x->rx_partial++;
-		if (unlikely(p->des01.rx.run_frame))
-			x->rx_runt++;
-		if (unlikely(p->des01.rx.frame_too_long))
-			x->rx_toolong++;
-		if (unlikely(p->des01.rx.collision)) {
-			x->rx_collision++;
-			stats->collisions++;
-		}
-		if (unlikely(p->des01.rx.crc_error)) {
-			x->rx_crc++;
-			stats->rx_crc_errors++;
-		}
-		ret = discard_frame;
-	}
-	if (unlikely(p->des01.rx.dribbling))
-		ret = discard_frame;
-
-	if (unlikely(p->des01.rx.length_error)) {
-		x->rx_length++;
-		ret = discard_frame;
-	}
-	if (unlikely(p->des01.rx.mii_error)) {
-		x->rx_mii++;
-		ret = discard_frame;
-	}
-	if (p->des01.rx.multicast_frame) {
-		x->rx_multicast++;
-		stats->multicast++;
-	}
-	return ret;
-}
-
-static void dwmac100_irq_status(unsigned long ioaddr)
-{
-	return;
-}
-
-static void dwmac100_set_umac_addr(unsigned long ioaddr, unsigned char *addr,
-			  unsigned int reg_n)
-{
-	stmmac_set_mac_addr(ioaddr, addr, MAC_ADDR_HIGH, MAC_ADDR_LOW);
-}
-
-static void dwmac100_get_umac_addr(unsigned long ioaddr, unsigned char *addr,
-			  unsigned int reg_n)
-{
-	stmmac_get_mac_addr(ioaddr, addr, MAC_ADDR_HIGH, MAC_ADDR_LOW);
-}
-
-static void dwmac100_set_filter(struct net_device *dev)
-{
-	unsigned long ioaddr = dev->base_addr;
-	u32 value = readl(ioaddr + MAC_CONTROL);
-
-	if (dev->flags & IFF_PROMISC) {
-		value |= MAC_CONTROL_PR;
-		value &= ~(MAC_CONTROL_PM | MAC_CONTROL_IF | MAC_CONTROL_HO |
-			   MAC_CONTROL_HP);
-	} else if ((netdev_mc_count(dev) > HASH_TABLE_SIZE)
-		   || (dev->flags & IFF_ALLMULTI)) {
-		value |= MAC_CONTROL_PM;
-		value &= ~(MAC_CONTROL_PR | MAC_CONTROL_IF | MAC_CONTROL_HO);
-		writel(0xffffffff, ioaddr + MAC_HASH_HIGH);
-		writel(0xffffffff, ioaddr + MAC_HASH_LOW);
-	} else if (netdev_mc_empty(dev)) {	/* no multicast */
-		value &= ~(MAC_CONTROL_PM | MAC_CONTROL_PR | MAC_CONTROL_IF |
-			   MAC_CONTROL_HO | MAC_CONTROL_HP);
-	} else {
-		u32 mc_filter[2];
-		struct netdev_hw_addr *ha;
-
-		/* Perfect filter mode for physical address and Hash
-		   filter for multicast */
-		value |= MAC_CONTROL_HP;
-		value &= ~(MAC_CONTROL_PM | MAC_CONTROL_PR |
-			   MAC_CONTROL_IF | MAC_CONTROL_HO);
-
-		memset(mc_filter, 0, sizeof(mc_filter));
-		netdev_for_each_mc_addr(ha, dev) {
-			/* The upper 6 bits of the calculated CRC are used to
-			 * index the contens of the hash table */
-			int bit_nr =
-			    ether_crc(ETH_ALEN, ha->addr) >> 26;
-			/* The most significant bit determines the register to
-			 * use (H/L) while the other 5 bits determine the bit
-			 * within the register. */
-			mc_filter[bit_nr >> 5] |= 1 << (bit_nr & 31);
-		}
-		writel(mc_filter[0], ioaddr + MAC_HASH_LOW);
-		writel(mc_filter[1], ioaddr + MAC_HASH_HIGH);
-	}
-
-	writel(value, ioaddr + MAC_CONTROL);
-
-	DBG(KERN_INFO "%s: CTRL reg: 0x%08x Hash regs: "
-	    "HI 0x%08x, LO 0x%08x\n",
-	    __func__, readl(ioaddr + MAC_CONTROL),
-	    readl(ioaddr + MAC_HASH_HIGH), readl(ioaddr + MAC_HASH_LOW));
-	return;
-}
-
-static void dwmac100_flow_ctrl(unsigned long ioaddr, unsigned int duplex,
-			     unsigned int fc, unsigned int pause_time)
-{
-	unsigned int flow = MAC_FLOW_CTRL_ENABLE;
-
-	if (duplex)
-		flow |= (pause_time << MAC_FLOW_CTRL_PT_SHIFT);
-	writel(flow, ioaddr + MAC_FLOW_CTRL);
-
-	return;
-}
-
-/* No PMT module supported for this Ethernet Controller.
- * Tested on ST platforms only.
- */
-static void dwmac100_pmt(unsigned long ioaddr, unsigned long mode)
-{
-	return;
-}
-
-static void dwmac100_init_rx_desc(struct dma_desc *p, unsigned int ring_size,
-				int disable_rx_ic)
-{
-	int i;
-	for (i = 0; i < ring_size; i++) {
-		p->des01.rx.own = 1;
-		p->des01.rx.buffer1_size = BUF_SIZE_2KiB - 1;
-		if (i == ring_size - 1)
-			p->des01.rx.end_ring = 1;
-		if (disable_rx_ic)
-			p->des01.rx.disable_ic = 1;
-		p++;
-	}
-	return;
-}
-
-static void dwmac100_init_tx_desc(struct dma_desc *p, unsigned int ring_size)
-{
-	int i;
-	for (i = 0; i < ring_size; i++) {
-		p->des01.tx.own = 0;
-		if (i == ring_size - 1)
-			p->des01.tx.end_ring = 1;
-		p++;
-	}
-	return;
-}
-
-static int dwmac100_get_tx_owner(struct dma_desc *p)
-{
-	return p->des01.tx.own;
-}
-
-static int dwmac100_get_rx_owner(struct dma_desc *p)
-{
-	return p->des01.rx.own;
-}
-
-static void dwmac100_set_tx_owner(struct dma_desc *p)
-{
-	p->des01.tx.own = 1;
-}
-
-static void dwmac100_set_rx_owner(struct dma_desc *p)
-{
-	p->des01.rx.own = 1;
-}
-
-static int dwmac100_get_tx_ls(struct dma_desc *p)
-{
-	return p->des01.tx.last_segment;
-}
-
-static void dwmac100_release_tx_desc(struct dma_desc *p)
-{
-	int ter = p->des01.tx.end_ring;
-
-	/* clean field used within the xmit */
-	p->des01.tx.first_segment = 0;
-	p->des01.tx.last_segment = 0;
-	p->des01.tx.buffer1_size = 0;
-
-	/* clean status reported */
-	p->des01.tx.error_summary = 0;
-	p->des01.tx.underflow_error = 0;
-	p->des01.tx.no_carrier = 0;
-	p->des01.tx.loss_carrier = 0;
-	p->des01.tx.excessive_deferral = 0;
-	p->des01.tx.excessive_collisions = 0;
-	p->des01.tx.late_collision = 0;
-	p->des01.tx.heartbeat_fail = 0;
-	p->des01.tx.deferred = 0;
-
-	/* set termination field */
-	p->des01.tx.end_ring = ter;
-
-	return;
-}
-
-static void dwmac100_prepare_tx_desc(struct dma_desc *p, int is_fs, int len,
-				   int csum_flag)
-{
-	p->des01.tx.first_segment = is_fs;
-	p->des01.tx.buffer1_size = len;
-}
-
-static void dwmac100_clear_tx_ic(struct dma_desc *p)
-{
-	p->des01.tx.interrupt = 0;
-}
-
-static void dwmac100_close_tx_desc(struct dma_desc *p)
-{
-	p->des01.tx.last_segment = 1;
-	p->des01.tx.interrupt = 1;
-}
-
-static int dwmac100_get_rx_frame_len(struct dma_desc *p)
-{
-	return p->des01.rx.frame_length;
-}
-
-struct stmmac_ops dwmac100_ops = {
-	.core_init = dwmac100_core_init,
-	.dump_regs = dwmac100_dump_mac_regs,
-	.host_irq_status = dwmac100_irq_status,
-	.set_filter = dwmac100_set_filter,
-	.flow_ctrl = dwmac100_flow_ctrl,
-	.pmt = dwmac100_pmt,
-	.set_umac_addr = dwmac100_set_umac_addr,
-	.get_umac_addr = dwmac100_get_umac_addr,
-};
-
-struct stmmac_dma_ops dwmac100_dma_ops = {
-	.init = dwmac100_dma_init,
-	.dump_regs = dwmac100_dump_dma_regs,
-	.dma_mode = dwmac100_dma_operation_mode,
-	.dma_diagnostic_fr = dwmac100_dma_diagnostic_fr,
-	.enable_dma_transmission = dwmac_enable_dma_transmission,
-	.enable_dma_irq = dwmac_enable_dma_irq,
-	.disable_dma_irq = dwmac_disable_dma_irq,
-	.start_tx = dwmac_dma_start_tx,
-	.stop_tx = dwmac_dma_stop_tx,
-	.start_rx = dwmac_dma_start_rx,
-	.stop_rx = dwmac_dma_stop_rx,
-	.dma_interrupt = dwmac_dma_interrupt,
-};
-
-struct stmmac_desc_ops dwmac100_desc_ops = {
-	.tx_status = dwmac100_get_tx_frame_status,
-	.rx_status = dwmac100_get_rx_frame_status,
-	.get_tx_len = dwmac100_get_tx_len,
-	.init_rx_desc = dwmac100_init_rx_desc,
-	.init_tx_desc = dwmac100_init_tx_desc,
-	.get_tx_owner = dwmac100_get_tx_owner,
-	.get_rx_owner = dwmac100_get_rx_owner,
-	.release_tx_desc = dwmac100_release_tx_desc,
-	.prepare_tx_desc = dwmac100_prepare_tx_desc,
-	.clear_tx_ic = dwmac100_clear_tx_ic,
-	.close_tx_desc = dwmac100_close_tx_desc,
-	.get_tx_ls = dwmac100_get_tx_ls,
-	.set_tx_owner = dwmac100_set_tx_owner,
-	.set_rx_owner = dwmac100_set_rx_owner,
-	.get_rx_frame_len = dwmac100_get_rx_frame_len,
-};
-
-struct mac_device_info *dwmac100_setup(unsigned long ioaddr)
-{
-	struct mac_device_info *mac;
-
-	mac = kzalloc(sizeof(const struct mac_device_info), GFP_KERNEL);
-
-	pr_info("\tDWMAC100\n");
-
-	mac->mac = &dwmac100_ops;
-	mac->desc = &dwmac100_desc_ops;
-	mac->dma = &dwmac100_dma_ops;
-
-	mac->pmt = PMT_NOT_SUPPORTED;
-	mac->link.port = MAC_CONTROL_PS;
-	mac->link.duplex = MAC_CONTROL_F;
-	mac->link.speed = 0;
-	mac->mii.addr = MAC_MII_ADDR;
-	mac->mii.data = MAC_MII_DATA;
-
-	return mac;
-}
diff --git a/drivers/net/stmmac/dwmac100.h b/drivers/net/stmmac/dwmac100.h
index 0f8f110..9f4ba2e 100644
--- a/drivers/net/stmmac/dwmac100.h
+++ b/drivers/net/stmmac/dwmac100.h
@@ -22,6 +22,9 @@
   Author: Giuseppe Cavallaro <peppe.cavallaro@st.com>
 *******************************************************************************/
 
+#include <linux/phy.h>
+#include "common.h"
+
 /*----------------------------------------------------------------------------
  *	 			MAC BLOCK defines
  *---------------------------------------------------------------------------*/
@@ -114,3 +117,17 @@ enum ttc_control {
 #define DMA_MISSED_FRAME_OVE_CNTR 0x0ffe0000	/* Overflow Frame Counter */
 #define DMA_MISSED_FRAME_OVE_M	0x00010000	/* Missed Frame Overflow */
 #define DMA_MISSED_FRAME_M_CNTR	0x0000ffff	/* Missed Frame Couinter */
+
+#undef DWMAC100_DEBUG
+/* #define DWMAC100_DEBUG */
+#undef FRAME_FILTER_DEBUG
+/* #define FRAME_FILTER_DEBUG */
+#ifdef DWMAC100_DEBUG
+#define DBG(fmt, args...)  printk(fmt, ## args)
+#else
+#define DBG(fmt, args...)  do { } while (0)
+#endif
+
+extern struct stmmac_dma_ops dwmac100_dma_ops;
+extern struct stmmac_desc_ops dwmac100_desc_ops;
+
diff --git a/drivers/net/stmmac/dwmac100_core.c b/drivers/net/stmmac/dwmac100_core.c
new file mode 100644
index 0000000..8ecb8c0
--- /dev/null
+++ b/drivers/net/stmmac/dwmac100_core.c
@@ -0,0 +1,202 @@
+/*******************************************************************************
+  This is the driver for the MAC 10/100 on-chip Ethernet controller
+  currently tested on all the ST boards based on STb7109 and stx7200 SoCs.
+
+  DWC Ether MAC 10/100 Universal version 4.0 has been used for developing
+  this code.
+
+  This only implements the mac core functions for this chip.
+
+  Copyright (C) 2007-2009  STMicroelectronics Ltd
+
+  This program is free software; you can redistribute it and/or modify it
+  under the terms and conditions of the GNU General Public License,
+  version 2, as published by the Free Software Foundation.
+
+  This program is distributed in the hope it will be useful, but WITHOUT
+  ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+  FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+  more details.
+
+  You should have received a copy of the GNU General Public License along with
+  this program; if not, write to the Free Software Foundation, Inc.,
+  51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA.
+
+  The full GNU General Public License is included in this distribution in
+  the file called "COPYING".
+
+  Author: Giuseppe Cavallaro <peppe.cavallaro@st.com>
+*******************************************************************************/
+
+#include <linux/crc32.h>
+#include "dwmac100.h"
+
+static void dwmac100_core_init(unsigned long ioaddr)
+{
+	u32 value = readl(ioaddr + MAC_CONTROL);
+
+	writel((value | MAC_CORE_INIT), ioaddr + MAC_CONTROL);
+
+#ifdef STMMAC_VLAN_TAG_USED
+	writel(ETH_P_8021Q, ioaddr + MAC_VLAN1);
+#endif
+	return;
+}
+
+static void dwmac100_dump_mac_regs(unsigned long ioaddr)
+{
+	pr_info("\t----------------------------------------------\n"
+		"\t  DWMAC 100 CSR (base addr = 0x%8x)\n"
+		"\t----------------------------------------------\n",
+		(unsigned int)ioaddr);
+	pr_info("\tcontrol reg (offset 0x%x): 0x%08x\n", MAC_CONTROL,
+		readl(ioaddr + MAC_CONTROL));
+	pr_info("\taddr HI (offset 0x%x): 0x%08x\n ", MAC_ADDR_HIGH,
+		readl(ioaddr + MAC_ADDR_HIGH));
+	pr_info("\taddr LO (offset 0x%x): 0x%08x\n", MAC_ADDR_LOW,
+		readl(ioaddr + MAC_ADDR_LOW));
+	pr_info("\tmulticast hash HI (offset 0x%x): 0x%08x\n",
+		MAC_HASH_HIGH, readl(ioaddr + MAC_HASH_HIGH));
+	pr_info("\tmulticast hash LO (offset 0x%x): 0x%08x\n",
+		MAC_HASH_LOW, readl(ioaddr + MAC_HASH_LOW));
+	pr_info("\tflow control (offset 0x%x): 0x%08x\n",
+		MAC_FLOW_CTRL, readl(ioaddr + MAC_FLOW_CTRL));
+	pr_info("\tVLAN1 tag (offset 0x%x): 0x%08x\n", MAC_VLAN1,
+		readl(ioaddr + MAC_VLAN1));
+	pr_info("\tVLAN2 tag (offset 0x%x): 0x%08x\n", MAC_VLAN2,
+		readl(ioaddr + MAC_VLAN2));
+	pr_info("\n\tMAC management counter registers\n");
+	pr_info("\t MMC crtl (offset 0x%x): 0x%08x\n",
+		MMC_CONTROL, readl(ioaddr + MMC_CONTROL));
+	pr_info("\t MMC High Interrupt (offset 0x%x): 0x%08x\n",
+		MMC_HIGH_INTR, readl(ioaddr + MMC_HIGH_INTR));
+	pr_info("\t MMC Low Interrupt (offset 0x%x): 0x%08x\n",
+		MMC_LOW_INTR, readl(ioaddr + MMC_LOW_INTR));
+	pr_info("\t MMC High Interrupt Mask (offset 0x%x): 0x%08x\n",
+		MMC_HIGH_INTR_MASK, readl(ioaddr + MMC_HIGH_INTR_MASK));
+	pr_info("\t MMC Low Interrupt Mask (offset 0x%x): 0x%08x\n",
+		MMC_LOW_INTR_MASK, readl(ioaddr + MMC_LOW_INTR_MASK));
+	return;
+}
+
+static void dwmac100_irq_status(unsigned long ioaddr)
+{
+	return;
+}
+
+static void dwmac100_set_umac_addr(unsigned long ioaddr, unsigned char *addr,
+				   unsigned int reg_n)
+{
+	stmmac_set_mac_addr(ioaddr, addr, MAC_ADDR_HIGH, MAC_ADDR_LOW);
+}
+
+static void dwmac100_get_umac_addr(unsigned long ioaddr, unsigned char *addr,
+				   unsigned int reg_n)
+{
+	stmmac_get_mac_addr(ioaddr, addr, MAC_ADDR_HIGH, MAC_ADDR_LOW);
+}
+
+static void dwmac100_set_filter(struct net_device *dev)
+{
+	unsigned long ioaddr = dev->base_addr;
+	u32 value = readl(ioaddr + MAC_CONTROL);
+
+	if (dev->flags & IFF_PROMISC) {
+		value |= MAC_CONTROL_PR;
+		value &= ~(MAC_CONTROL_PM | MAC_CONTROL_IF | MAC_CONTROL_HO |
+			   MAC_CONTROL_HP);
+	} else if ((netdev_mc_count(dev) > HASH_TABLE_SIZE)
+		   || (dev->flags & IFF_ALLMULTI)) {
+		value |= MAC_CONTROL_PM;
+		value &= ~(MAC_CONTROL_PR | MAC_CONTROL_IF | MAC_CONTROL_HO);
+		writel(0xffffffff, ioaddr + MAC_HASH_HIGH);
+		writel(0xffffffff, ioaddr + MAC_HASH_LOW);
+	} else if (netdev_mc_empty(dev)) {	/* no multicast */
+		value &= ~(MAC_CONTROL_PM | MAC_CONTROL_PR | MAC_CONTROL_IF |
+			   MAC_CONTROL_HO | MAC_CONTROL_HP);
+	} else {
+		u32 mc_filter[2];
+		struct netdev_hw_addr *ha;
+
+		/* Perfect filter mode for physical address and Hash
+		   filter for multicast */
+		value |= MAC_CONTROL_HP;
+		value &= ~(MAC_CONTROL_PM | MAC_CONTROL_PR |
+			   MAC_CONTROL_IF | MAC_CONTROL_HO);
+
+		memset(mc_filter, 0, sizeof(mc_filter));
+		netdev_for_each_mc_addr(ha, dev) {
+			/* The upper 6 bits of the calculated CRC are used to
+			 * index the contents of the hash table */
+			int bit_nr =
+			    ether_crc(ETH_ALEN, ha->addr) >> 26;
+			/* The most significant bit determines the register to
+			 * use (H/L) while the other 5 bits determine the bit
+			 * within the register. */
+			mc_filter[bit_nr >> 5] |= 1 << (bit_nr & 31);
+		}
+		writel(mc_filter[0], ioaddr + MAC_HASH_LOW);
+		writel(mc_filter[1], ioaddr + MAC_HASH_HIGH);
+	}
+
+	writel(value, ioaddr + MAC_CONTROL);
+
+	DBG(KERN_INFO "%s: CTRL reg: 0x%08x Hash regs: "
+	    "HI 0x%08x, LO 0x%08x\n",
+	    __func__, readl(ioaddr + MAC_CONTROL),
+	    readl(ioaddr + MAC_HASH_HIGH), readl(ioaddr + MAC_HASH_LOW));
+	return;
+}
+
+static void dwmac100_flow_ctrl(unsigned long ioaddr, unsigned int duplex,
+			       unsigned int fc, unsigned int pause_time)
+{
+	unsigned int flow = MAC_FLOW_CTRL_ENABLE;
+
+	if (duplex)
+		flow |= (pause_time << MAC_FLOW_CTRL_PT_SHIFT);
+	writel(flow, ioaddr + MAC_FLOW_CTRL);
+
+	return;
+}
+
+/* No PMT module supported for this Ethernet Controller.
+ * Tested on ST platforms only.
+ */
+static void dwmac100_pmt(unsigned long ioaddr, unsigned long mode)
+{
+	return;
+}
+
+struct stmmac_ops dwmac100_ops = {
+	.core_init = dwmac100_core_init,
+	.dump_regs = dwmac100_dump_mac_regs,
+	.host_irq_status = dwmac100_irq_status,
+	.set_filter = dwmac100_set_filter,
+	.flow_ctrl = dwmac100_flow_ctrl,
+	.pmt = dwmac100_pmt,
+	.set_umac_addr = dwmac100_set_umac_addr,
+	.get_umac_addr = dwmac100_get_umac_addr,
+};
+
+struct mac_device_info *dwmac100_setup(unsigned long ioaddr)
+{
+	struct mac_device_info *mac;
+
+	mac = kzalloc(sizeof(const struct mac_device_info), GFP_KERNEL);
+
+	pr_info("\tDWMAC100\n");
+
+	mac->mac = &dwmac100_ops;
+	mac->desc = &dwmac100_desc_ops;
+	mac->dma = &dwmac100_dma_ops;
+
+	mac->pmt = PMT_NOT_SUPPORTED;
+	mac->link.port = MAC_CONTROL_PS;
+	mac->link.duplex = MAC_CONTROL_F;
+	mac->link.speed = 0;
+	mac->mii.addr = MAC_MII_ADDR;
+	mac->mii.data = MAC_MII_DATA;
+
+	return mac;
+}
diff --git a/drivers/net/stmmac/dwmac100_dma.c b/drivers/net/stmmac/dwmac100_dma.c
new file mode 100644
index 0000000..7fcc526
--- /dev/null
+++ b/drivers/net/stmmac/dwmac100_dma.c
@@ -0,0 +1,353 @@
+/*******************************************************************************
+  This is the driver for the MAC 10/100 on-chip Ethernet controller
+  currently tested on all the ST boards based on STb7109 and stx7200 SoCs.
+
+  DWC Ether MAC 10/100 Universal version 4.0 has been used for developing
+  this code.
+
+  This contains the functions to handle the dma and descriptors.
+
+  Copyright (C) 2007-2009  STMicroelectronics Ltd
+
+  This program is free software; you can redistribute it and/or modify it
+  under the terms and conditions of the GNU General Public License,
+  version 2, as published by the Free Software Foundation.
+
+  This program is distributed in the hope it will be useful, but WITHOUT
+  ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+  FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+  more details.
+
+  You should have received a copy of the GNU General Public License along with
+  this program; if not, write to the Free Software Foundation, Inc.,
+  51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA.
+
+  The full GNU General Public License is included in this distribution in
+  the file called "COPYING".
+
+  Author: Giuseppe Cavallaro <peppe.cavallaro@st.com>
+*******************************************************************************/
+
+#include "dwmac100.h"
+#include "dwmac_dma.h"
+
+static int dwmac100_dma_init(unsigned long ioaddr, int pbl, u32 dma_tx,
+			     u32 dma_rx)
+{
+	u32 value = readl(ioaddr + DMA_BUS_MODE);
+	/* DMA SW reset */
+	value |= DMA_BUS_MODE_SFT_RESET;
+	writel(value, ioaddr + DMA_BUS_MODE);
+	do {} while ((readl(ioaddr + DMA_BUS_MODE) & DMA_BUS_MODE_SFT_RESET));
+
+	/* Enable Application Access by writing to DMA CSR0 */
+	writel(DMA_BUS_MODE_DEFAULT | (pbl << DMA_BUS_MODE_PBL_SHIFT),
+	       ioaddr + DMA_BUS_MODE);
+
+	/* Mask interrupts by writing to CSR7 */
+	writel(DMA_INTR_DEFAULT_MASK, ioaddr + DMA_INTR_ENA);
+
+	/* The base address of the RX/TX descriptor lists must be written into
+	 * DMA CSR3 and CSR4, respectively. */
+	writel(dma_tx, ioaddr + DMA_TX_BASE_ADDR);
+	writel(dma_rx, ioaddr + DMA_RCV_BASE_ADDR);
+
+	return 0;
+}
+
+/* Store and Forward capability is not used at all.
+ * The transmit threshold can be programmed by
+ * setting the TTC bits in the DMA control register.*/
+static void dwmac100_dma_operation_mode(unsigned long ioaddr, int txmode,
+					int rxmode)
+{
+	u32 csr6 = readl(ioaddr + DMA_CONTROL);
+
+	if (txmode <= 32)
+		csr6 |= DMA_CONTROL_TTC_32;
+	else if (txmode <= 64)
+		csr6 |= DMA_CONTROL_TTC_64;
+	else
+		csr6 |= DMA_CONTROL_TTC_128;
+
+	writel(csr6, ioaddr + DMA_CONTROL);
+
+	return;
+}
+
+static void dwmac100_dump_dma_regs(unsigned long ioaddr)
+{
+	int i;
+
+	DBG(KERN_DEBUG "DWMAC 100 DMA CSR\n");
+	for (i = 0; i < 9; i++)
+		pr_debug("\t CSR%d (offset 0x%x): 0x%08x\n", i,
+		       (DMA_BUS_MODE + i * 4),
+		       readl(ioaddr + DMA_BUS_MODE + i * 4));
+	DBG(KERN_DEBUG "\t CSR20 (offset 0x%x): 0x%08x\n",
+	    DMA_CUR_TX_BUF_ADDR, readl(ioaddr + DMA_CUR_TX_BUF_ADDR));
+	DBG(KERN_DEBUG "\t CSR21 (offset 0x%x): 0x%08x\n",
+	    DMA_CUR_RX_BUF_ADDR, readl(ioaddr + DMA_CUR_RX_BUF_ADDR));
+	return;
+}
+
+/* DMA controller has two counters to track the number of
+ * the receive missed frames. */
+static void dwmac100_dma_diagnostic_fr(void *data, struct stmmac_extra_stats *x,
+				       unsigned long ioaddr)
+{
+	struct net_device_stats *stats = (struct net_device_stats *)data;
+	u32 csr8 = readl(ioaddr + DMA_MISSED_FRAME_CTR);
+
+	if (unlikely(csr8)) {
+		if (csr8 & DMA_MISSED_FRAME_OVE) {
+			stats->rx_over_errors += 0x800;
+			x->rx_overflow_cntr += 0x800;
+		} else {
+			unsigned int ove_cntr;
+			ove_cntr = ((csr8 & DMA_MISSED_FRAME_OVE_CNTR) >> 17);
+			stats->rx_over_errors += ove_cntr;
+			x->rx_overflow_cntr += ove_cntr;
+		}
+
+		if (csr8 & DMA_MISSED_FRAME_OVE_M) {
+			stats->rx_missed_errors += 0xffff;
+			x->rx_missed_cntr += 0xffff;
+		} else {
+			unsigned int miss_f = (csr8 & DMA_MISSED_FRAME_M_CNTR);
+			stats->rx_missed_errors += miss_f;
+			x->rx_missed_cntr += miss_f;
+		}
+	}
+	return;
+}
+
+static int dwmac100_get_tx_status(void *data, struct stmmac_extra_stats *x,
+				  struct dma_desc *p, unsigned long ioaddr)
+{
+	int ret = 0;
+	struct net_device_stats *stats = (struct net_device_stats *)data;
+
+	if (unlikely(p->des01.tx.error_summary)) {
+		if (unlikely(p->des01.tx.underflow_error)) {
+			x->tx_underflow++;
+			stats->tx_fifo_errors++;
+		}
+		if (unlikely(p->des01.tx.no_carrier)) {
+			x->tx_carrier++;
+			stats->tx_carrier_errors++;
+		}
+		if (unlikely(p->des01.tx.loss_carrier)) {
+			x->tx_losscarrier++;
+			stats->tx_carrier_errors++;
+		}
+		if (unlikely((p->des01.tx.excessive_deferral) ||
+			     (p->des01.tx.excessive_collisions) ||
+			     (p->des01.tx.late_collision)))
+			stats->collisions += p->des01.tx.collision_count;
+		ret = -1;
+	}
+	if (unlikely(p->des01.tx.heartbeat_fail)) {
+		x->tx_heartbeat++;
+		stats->tx_heartbeat_errors++;
+		ret = -1;
+	}
+	if (unlikely(p->des01.tx.deferred))
+		x->tx_deferred++;
+
+	return ret;
+}
+
+static int dwmac100_get_tx_len(struct dma_desc *p)
+{
+	return p->des01.tx.buffer1_size;
+}
+
+/* This function verifies if each incoming frame has some errors
+ * and, if required, updates the multicast statistics.
+ * In case of success, it returns csum_none because the device
+ * is not able to compute the csum in HW. */
+static int dwmac100_get_rx_status(void *data, struct stmmac_extra_stats *x,
+				  struct dma_desc *p)
+{
+	int ret = csum_none;
+	struct net_device_stats *stats = (struct net_device_stats *)data;
+
+	if (unlikely(p->des01.rx.last_descriptor == 0)) {
+		pr_warning("dwmac100 Error: Oversized Ethernet "
+			   "frame spanned multiple buffers\n");
+		stats->rx_length_errors++;
+		return discard_frame;
+	}
+
+	if (unlikely(p->des01.rx.error_summary)) {
+		if (unlikely(p->des01.rx.descriptor_error))
+			x->rx_desc++;
+		if (unlikely(p->des01.rx.partial_frame_error))
+			x->rx_partial++;
+		if (unlikely(p->des01.rx.run_frame))
+			x->rx_runt++;
+		if (unlikely(p->des01.rx.frame_too_long))
+			x->rx_toolong++;
+		if (unlikely(p->des01.rx.collision)) {
+			x->rx_collision++;
+			stats->collisions++;
+		}
+		if (unlikely(p->des01.rx.crc_error)) {
+			x->rx_crc++;
+			stats->rx_crc_errors++;
+		}
+		ret = discard_frame;
+	}
+	if (unlikely(p->des01.rx.dribbling))
+		ret = discard_frame;
+
+	if (unlikely(p->des01.rx.length_error)) {
+		x->rx_length++;
+		ret = discard_frame;
+	}
+	if (unlikely(p->des01.rx.mii_error)) {
+		x->rx_mii++;
+		ret = discard_frame;
+	}
+	if (p->des01.rx.multicast_frame) {
+		x->rx_multicast++;
+		stats->multicast++;
+	}
+	return ret;
+}
+
+static void dwmac100_init_rx_desc(struct dma_desc *p, unsigned int ring_size,
+				  int disable_rx_ic)
+{
+	int i;
+	for (i = 0; i < ring_size; i++) {
+		p->des01.rx.own = 1;
+		p->des01.rx.buffer1_size = BUF_SIZE_2KiB - 1;
+		if (i == ring_size - 1)
+			p->des01.rx.end_ring = 1;
+		if (disable_rx_ic)
+			p->des01.rx.disable_ic = 1;
+		p++;
+	}
+	return;
+}
+
+static void dwmac100_init_tx_desc(struct dma_desc *p, unsigned int ring_size)
+{
+	int i;
+	for (i = 0; i < ring_size; i++) {
+		p->des01.tx.own = 0;
+		if (i == ring_size - 1)
+			p->des01.tx.end_ring = 1;
+		p++;
+	}
+	return;
+}
+
+static int dwmac100_get_tx_owner(struct dma_desc *p)
+{
+	return p->des01.tx.own;
+}
+
+static int dwmac100_get_rx_owner(struct dma_desc *p)
+{
+	return p->des01.rx.own;
+}
+
+static void dwmac100_set_tx_owner(struct dma_desc *p)
+{
+	p->des01.tx.own = 1;
+}
+
+static void dwmac100_set_rx_owner(struct dma_desc *p)
+{
+	p->des01.rx.own = 1;
+}
+
+static int dwmac100_get_tx_ls(struct dma_desc *p)
+{
+	return p->des01.tx.last_segment;
+}
+
+static void dwmac100_release_tx_desc(struct dma_desc *p)
+{
+	int ter = p->des01.tx.end_ring;
+
+	/* clean field used within the xmit */
+	p->des01.tx.first_segment = 0;
+	p->des01.tx.last_segment = 0;
+	p->des01.tx.buffer1_size = 0;
+
+	/* clean status reported */
+	p->des01.tx.error_summary = 0;
+	p->des01.tx.underflow_error = 0;
+	p->des01.tx.no_carrier = 0;
+	p->des01.tx.loss_carrier = 0;
+	p->des01.tx.excessive_deferral = 0;
+	p->des01.tx.excessive_collisions = 0;
+	p->des01.tx.late_collision = 0;
+	p->des01.tx.heartbeat_fail = 0;
+	p->des01.tx.deferred = 0;
+
+	/* set termination field */
+	p->des01.tx.end_ring = ter;
+
+	return;
+}
+
+static void dwmac100_prepare_tx_desc(struct dma_desc *p, int is_fs, int len,
+				     int csum_flag)
+{
+	p->des01.tx.first_segment = is_fs;
+	p->des01.tx.buffer1_size = len;
+}
+
+static void dwmac100_clear_tx_ic(struct dma_desc *p)
+{
+	p->des01.tx.interrupt = 0;
+}
+
+static void dwmac100_close_tx_desc(struct dma_desc *p)
+{
+	p->des01.tx.last_segment = 1;
+	p->des01.tx.interrupt = 1;
+}
+
+static int dwmac100_get_rx_frame_len(struct dma_desc *p)
+{
+	return p->des01.rx.frame_length;
+}
+
+struct stmmac_dma_ops dwmac100_dma_ops = {
+	.init = dwmac100_dma_init,
+	.dump_regs = dwmac100_dump_dma_regs,
+	.dma_mode = dwmac100_dma_operation_mode,
+	.dma_diagnostic_fr = dwmac100_dma_diagnostic_fr,
+	.enable_dma_transmission = dwmac_enable_dma_transmission,
+	.enable_dma_irq = dwmac_enable_dma_irq,
+	.disable_dma_irq = dwmac_disable_dma_irq,
+	.start_tx = dwmac_dma_start_tx,
+	.stop_tx = dwmac_dma_stop_tx,
+	.start_rx = dwmac_dma_start_rx,
+	.stop_rx = dwmac_dma_stop_rx,
+	.dma_interrupt = dwmac_dma_interrupt,
+};
+
+struct stmmac_desc_ops dwmac100_desc_ops = {
+	.tx_status = dwmac100_get_tx_status,
+	.rx_status = dwmac100_get_rx_status,
+	.get_tx_len = dwmac100_get_tx_len,
+	.init_rx_desc = dwmac100_init_rx_desc,
+	.init_tx_desc = dwmac100_init_tx_desc,
+	.get_tx_owner = dwmac100_get_tx_owner,
+	.get_rx_owner = dwmac100_get_rx_owner,
+	.release_tx_desc = dwmac100_release_tx_desc,
+	.prepare_tx_desc = dwmac100_prepare_tx_desc,
+	.clear_tx_ic = dwmac100_clear_tx_ic,
+	.close_tx_desc = dwmac100_close_tx_desc,
+	.get_tx_ls = dwmac100_get_tx_ls,
+	.set_tx_owner = dwmac100_set_tx_owner,
+	.set_rx_owner = dwmac100_set_rx_owner,
+	.get_rx_frame_len = dwmac100_get_rx_frame_len,
+};
-- 
1.6.0.4


^ permalink raw reply related

* [net-next 5/7] stmmac: get the descriptor structure from platform
From: Giuseppe CAVALLARO @ 2010-04-14  6:21 UTC (permalink / raw)
  To: netdev; +Cc: Giuseppe Cavallaro
In-Reply-To: <1271226077-25882-4-git-send-email-peppe.cavallaro@st.com>

Output for chip that uses the Enhanced descriptors:
[snip]
STMMAC driver:
	platform registration... done!
	DWMAC1000 - user ID: 0x10, Synopsys ID: 0x33
	Enhanced descriptor structure
	no valid MAC address;please, use ifconfig or nwhwconfig!
	eth0 - (dev. name: stmmaceth - id: 0, IRQ #134
	IO base addr: 0xfd110000)
STMMAC MII Bus: probed
[snip]

Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
---
 drivers/net/stmmac/stmmac.h      |    1 +
 drivers/net/stmmac/stmmac_main.c |   12 ++++++++----
 2 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/drivers/net/stmmac/stmmac.h b/drivers/net/stmmac/stmmac.h
index 55b9aca..0d776bc 100644
--- a/drivers/net/stmmac/stmmac.h
+++ b/drivers/net/stmmac/stmmac.h
@@ -93,6 +93,7 @@ struct stmmac_priv {
 #ifdef STMMAC_VLAN_TAG_USED
 	struct vlan_group *vlgrp;
 #endif
+	int enh_desc;
 };
 
 #ifdef CONFIG_STM_DRIVERS
diff --git a/drivers/net/stmmac/stmmac_main.c b/drivers/net/stmmac/stmmac_main.c
index dafe4dc..78d1a8b 100644
--- a/drivers/net/stmmac/stmmac_main.c
+++ b/drivers/net/stmmac/stmmac_main.c
@@ -1582,13 +1582,16 @@ static int stmmac_mac_device_setup(struct net_device *dev)
 
 	struct mac_device_info *device;
 
-	if (priv->is_gmac) {
+	if (priv->is_gmac)
 		device = dwmac1000_setup(ioaddr);
-		device->desc = &enh_desc_ops;
-	} else {
+	else
 		device = dwmac100_setup(ioaddr);
+
+	if (priv->enh_desc) {
+		device->desc = &enh_desc_ops;
+		pr_info("\tEnhanced descriptor structure\n");
+	} else
 		device->desc = &ndesc_ops;
-	}
 
 	if (!device)
 		return -ENOMEM;
@@ -1730,6 +1733,7 @@ static int stmmac_dvr_probe(struct platform_device *pdev)
 	priv->bus_id = plat_dat->bus_id;
 	priv->pbl = plat_dat->pbl;	/* TLI */
 	priv->is_gmac = plat_dat->has_gmac;	/* GMAC is on board */
+	priv->enh_desc = plat_dat->enh_desc;
 
 	platform_set_drvdata(pdev, ndev);
 
-- 
1.6.0.4
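
A side note on the stmmac_mac_device_setup() hunk above: device->desc is
assigned before the existing !device check, so if the setup call can
return NULL (which that check suggests), the pointer would be dereferenced
first. A minimal userspace sketch of the same selection with the check
moved up follows. The struct definitions and select_desc_ops() are
stand-ins invented for illustration, not the driver's real types; only
the names enh_desc_ops, ndesc_ops and the enh_desc flag come from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Stand-in types: only the fields needed for this sketch. */
struct desc_ops {
	const char *name;
};

struct mac_device_info {
	const struct desc_ops *desc;
};

static const struct desc_ops enh_desc_ops = { "enhanced" };
static const struct desc_ops ndesc_ops = { "normal" };

/*
 * Pick the descriptor ops from the platform flag rather than from the
 * MAC type. 'device' is whatever the MAC setup call returned and may be
 * NULL if that allocation failed.
 */
static int select_desc_ops(struct mac_device_info *device, int enh_desc)
{
	if (!device)
		return -ENOMEM;	/* check before the first dereference */

	device->desc = enh_desc ? &enh_desc_ops : &ndesc_ops;
	assert(device->desc != NULL);
	return 0;
}
```

With this ordering a failed setup returns -ENOMEM without touching the
pointer, and the descriptor choice depends only on the platform flag.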


^ permalink raw reply related

* [net-next 6/7] stmmac: fix vlan support setup
From: Giuseppe CAVALLARO @ 2010-04-14  6:21 UTC (permalink / raw)
  To: netdev; +Cc: Giuseppe Cavallaro
In-Reply-To: <1271226077-25882-5-git-send-email-peppe.cavallaro@st.com>

Moved STMMAC_VLAN_TAG_USED from stmmac.h to common.h header
because it is used within the device and descriptor cores.

Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
---
 drivers/net/stmmac/common.h |    5 +++++
 drivers/net/stmmac/stmmac.h |    5 -----
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/net/stmmac/common.h b/drivers/net/stmmac/common.h
index 27a05b4..144f76f 100644
--- a/drivers/net/stmmac/common.h
+++ b/drivers/net/stmmac/common.h
@@ -23,6 +23,11 @@
 *******************************************************************************/
 
 #include <linux/netdevice.h>
+#if defined(CONFIG_VLAN_8021Q) || defined(CONFIG_VLAN_8021Q_MODULE)
+#define STMMAC_VLAN_TAG_USED
+#include <linux/if_vlan.h>
+#endif
+
 #include "descs.h"
 
 #undef CHIP_DEBUG_PRINT
diff --git a/drivers/net/stmmac/stmmac.h b/drivers/net/stmmac/stmmac.h
index 0d776bc..1a6eb7b 100644
--- a/drivers/net/stmmac/stmmac.h
+++ b/drivers/net/stmmac/stmmac.h
@@ -23,11 +23,6 @@
 #define DRV_MODULE_VERSION	"Jan_2010"
 #include <linux/stmmac.h>
 
-#if defined(CONFIG_VLAN_8021Q) || defined(CONFIG_VLAN_8021Q_MODULE)
-#define STMMAC_VLAN_TAG_USED
-#include <linux/if_vlan.h>
-#endif
-
 #include "common.h"
 #ifdef CONFIG_STMMAC_TIMER
 #include "stmmac_timer.h"
-- 
1.6.0.4


^ permalink raw reply related

* [net-next 7/7] stmmac: updated the drv module version
From: Giuseppe CAVALLARO @ 2010-04-14  6:21 UTC (permalink / raw)
  To: netdev; +Cc: Giuseppe Cavallaro
In-Reply-To: <1271226077-25882-6-git-send-email-peppe.cavallaro@st.com>

Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
---
 drivers/net/stmmac/stmmac.h |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/net/stmmac/stmmac.h b/drivers/net/stmmac/stmmac.h
index 1a6eb7b..ebebc64 100644
--- a/drivers/net/stmmac/stmmac.h
+++ b/drivers/net/stmmac/stmmac.h
@@ -20,7 +20,7 @@
   Author: Giuseppe Cavallaro <peppe.cavallaro@st.com>
 *******************************************************************************/
 
-#define DRV_MODULE_VERSION	"Jan_2010"
+#define DRV_MODULE_VERSION	"Apr_2010"
 #include <linux/stmmac.h>
 
 #include "common.h"
-- 
1.6.0.4


^ permalink raw reply related

* Re: [PATCH] fix potential wild pointer when NIC is dying
From: Changli Gao @ 2010-04-14  7:25 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David S. Miller, Tom Herbert, Herbert Xu, netdev
In-Reply-To: <1271223212.16881.598.camel@edumazet-laptop>

On Wed, Apr 14, 2010 at 1:33 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Wednesday 14 April 2010 at 20:18 +0800, Changli Gao wrote:
>
> I don't see how the problem can happen, or how RPS is involved.
>
> Did you get a single panic? Could you provide us a stack trace?
>
> Maybe you are referring to NAPI?
>
> NAPI processes packets delivered by the NIC and, through RPS, delivers
> them to a (possibly) remote CPU queue.
>
> But at device dismantle time, we should stop NAPI on this device, and
> with it the packet delivery machinery. Whether RPS is on or not, NAPI
> won't deliver new packets. The fact that NAPI can be throttled doesn't
> change the napi instance being disabled at this point. No more packets
> will be delivered (RPS or not).
>
> Only after this point do we call flush_backlog() to make sure we don't
> have any queued packet in each cpu's input_pkt_queue pointing to the
> device we dismantle.
>
> RPS doesn't change this at all.
>
> Hmm ???
>

Thanks, I got it.

-- 
Regards,
Changli Gao(xiaosuo@gmail.com)

^ permalink raw reply

* [PATCH net-next-2.6] fasync: RCU locking
From: Eric Dumazet @ 2010-04-14  7:42 UTC (permalink / raw)
  To: David Miller, Paul E. McKenney; +Cc: netdev, linux-kernel

Paul, could you please check this patch? I am not sure
about the IRQ safety aspect...

Is call_rcu() the right method to use in this case?

Thanks

[PATCH net-next-2.6] fasync: RCU locking

kill_fasync() uses a central rwlock, which makes it a candidate for RCU
conversion.

We can remove the direct use of __kill_fasync() in net, and rename it to
kill_fasync_rcu().

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
 fs/fcntl.c         |   36 +++++++++++++++++++++---------------
 include/linux/fs.h |   11 +++++------
 net/socket.c       |    4 ++--
 3 files changed, 28 insertions(+), 23 deletions(-)

diff --git a/fs/fcntl.c b/fs/fcntl.c
index 452d02f..33cb3ee 100644
--- a/fs/fcntl.c
+++ b/fs/fcntl.c
@@ -614,9 +614,15 @@ int send_sigurg(struct fown_struct *fown)
 	return ret;
 }
 
-static DEFINE_RWLOCK(fasync_lock);
+static DEFINE_SPINLOCK(fasync_lock);
 static struct kmem_cache *fasync_cache __read_mostly;
 
+static void fasync_free_rcu(struct rcu_head *head)
+{
+	kmem_cache_free(fasync_cache,
+			container_of(head, struct fasync_struct, fa_rcu));
+}
+
 /*
  * Remove a fasync entry. If successfully removed, return
  * positive and clear the FASYNC flag. If no entry exists,
@@ -634,17 +640,17 @@ static int fasync_remove_entry(struct file *filp, struct fasync_struct **fapp)
 	int result = 0;
 
 	spin_lock(&filp->f_lock);
-	write_lock_irq(&fasync_lock);
+	spin_lock_irq(&fasync_lock);
 	for (fp = fapp; (fa = *fp) != NULL; fp = &fa->fa_next) {
 		if (fa->fa_file != filp)
 			continue;
 		*fp = fa->fa_next;
-		kmem_cache_free(fasync_cache, fa);
+		call_rcu(&fa->fa_rcu, fasync_free_rcu);
 		filp->f_flags &= ~FASYNC;
 		result = 1;
 		break;
 	}
-	write_unlock_irq(&fasync_lock);
+	spin_unlock_irq(&fasync_lock);
 	spin_unlock(&filp->f_lock);
 	return result;
 }
@@ -666,7 +672,7 @@ static int fasync_add_entry(int fd, struct file *filp, struct fasync_struct **fa
 		return -ENOMEM;
 
 	spin_lock(&filp->f_lock);
-	write_lock_irq(&fasync_lock);
+	spin_lock_irq(&fasync_lock);
 	for (fp = fapp; (fa = *fp) != NULL; fp = &fa->fa_next) {
 		if (fa->fa_file != filp)
 			continue;
@@ -679,12 +685,12 @@ static int fasync_add_entry(int fd, struct file *filp, struct fasync_struct **fa
 	new->fa_file = filp;
 	new->fa_fd = fd;
 	new->fa_next = *fapp;
-	*fapp = new;
+	rcu_assign_pointer(*fapp, new);
 	result = 1;
 	filp->f_flags |= FASYNC;
 
 out:
-	write_unlock_irq(&fasync_lock);
+	spin_unlock_irq(&fasync_lock);
 	spin_unlock(&filp->f_lock);
 	return result;
 }
@@ -704,7 +710,10 @@ int fasync_helper(int fd, struct file * filp, int on, struct fasync_struct **fap
 
 EXPORT_SYMBOL(fasync_helper);
 
-void __kill_fasync(struct fasync_struct *fa, int sig, int band)
+/*
+ * rcu_read_lock() is held
+ */
+static void kill_fasync_rcu(struct fasync_struct *fa, int sig, int band)
 {
 	while (fa) {
 		struct fown_struct * fown;
@@ -719,22 +728,19 @@ void __kill_fasync(struct fasync_struct *fa, int sig, int band)
 		   mechanism. */
 		if (!(sig == SIGURG && fown->signum == 0))
 			send_sigio(fown, fa->fa_fd, band);
-		fa = fa->fa_next;
+		fa = rcu_dereference(fa->fa_next);
 	}
 }
 
-EXPORT_SYMBOL(__kill_fasync);
-
 void kill_fasync(struct fasync_struct **fp, int sig, int band)
 {
 	/* First a quick test without locking: usually
 	 * the list is empty.
 	 */
 	if (*fp) {
-		read_lock(&fasync_lock);
-		/* reread *fp after obtaining the lock */
-		__kill_fasync(*fp, sig, band);
-		read_unlock(&fasync_lock);
+		rcu_read_lock();
+		kill_fasync_rcu(rcu_dereference(*fp), sig, band);
+		rcu_read_unlock();
 	}
 }
 EXPORT_SYMBOL(kill_fasync);
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 39d57bc..158b2cc 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1280,10 +1280,11 @@ static inline int lock_may_write(struct inode *inode, loff_t start,
 
 
 struct fasync_struct {
-	int	magic;
-	int	fa_fd;
-	struct	fasync_struct	*fa_next; /* singly linked list */
-	struct	file 		*fa_file;
+	int			magic;
+	int			fa_fd;
+	struct fasync_struct	*fa_next; /* singly linked list */
+	struct file		*fa_file;
+	struct rcu_head		fa_rcu;
 };
 
 #define FASYNC_MAGIC 0x4601
@@ -1292,8 +1293,6 @@ struct fasync_struct {
 extern int fasync_helper(int, struct file *, int, struct fasync_struct **);
 /* can be called from interrupts */
 extern void kill_fasync(struct fasync_struct **, int, int);
-/* only for net: no internal synchronization */
-extern void __kill_fasync(struct fasync_struct *, int, int);
 
 extern int __f_setown(struct file *filp, struct pid *, enum pid_type, int force);
 extern int f_setown(struct file *filp, unsigned long arg, int force);
diff --git a/net/socket.c b/net/socket.c
index 35bc198..846739c 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -1159,10 +1159,10 @@ int sock_wake_async(struct socket *sock, int how, int band)
 		/* fall through */
 	case SOCK_WAKE_IO:
 call_kill:
-		__kill_fasync(sock->fasync_list, SIGIO, band);
+		kill_fasync(&sock->fasync_list, SIGIO, band);
 		break;
 	case SOCK_WAKE_URG:
-		__kill_fasync(sock->fasync_list, SIGURG, band);
+		kill_fasync(&sock->fasync_list, SIGURG, band);
 	}
 	return 0;
 }
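
For readers less familiar with the pattern, the publish/traverse ordering
that rcu_assign_pointer() and rcu_dereference() provide in the patch above
can be approximated in userspace with C11 atomics. This is only a sketch
under that assumption: struct node, list_publish() and list_count() are
hypothetical names, acquire loads stand in for rcu_dereference(), and
nothing here models the grace period that call_rcu() waits for before
fasync_free_rcu() runs.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct node {
	int fd;
	struct node *_Atomic next;
};

/*
 * Writer side. Callers are assumed to serialize among themselves (the
 * patch uses a spinlock for that). The node is fully initialised first
 * and only then made visible with a release store.
 */
static void list_publish(struct node *_Atomic *head, struct node *n, int fd)
{
	n->fd = fd;
	atomic_store_explicit(&n->next,
			      atomic_load_explicit(head, memory_order_relaxed),
			      memory_order_relaxed);
	atomic_store_explicit(head, n, memory_order_release);
}

/*
 * Reader side. Lockless walk: each acquire load pairs with the release
 * store above, so a reader that sees a node also sees its fields.
 */
static int list_count(struct node *_Atomic *head)
{
	int count = 0;
	struct node *n = atomic_load_explicit(head, memory_order_acquire);

	while (n) {
		assert(n->fd >= 0);	/* fields are valid once published */
		count++;
		n = atomic_load_explicit(&n->next, memory_order_acquire);
	}
	return count;
}
```

The removal side is deliberately left out: unlinking a node is easy, but
freeing it safely is exactly the part that needs call_rcu() or an
equivalent deferred-reclamation scheme.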

^ permalink raw reply related

* Re: [PATCH] fix potential wild pointer when NIC is dying
From: Eric Dumazet @ 2010-04-14  7:49 UTC (permalink / raw)
  To: Changli Gao; +Cc: David S. Miller, Tom Herbert, Herbert Xu, netdev
In-Reply-To: <u2r412e6f7f1004140025i51e533c9t7402bc751dd925c2@mail.gmail.com>

On Wednesday 14 April 2010 at 15:25 +0800, Changli Gao wrote:

> 
> Thanks, I got it.
> 

No problem, it's better to double check anyway :)




^ permalink raw reply

* [PATCH v2] RPS: export internal software RX queues via sysfs
From: Changli Gao @ 2010-04-14  7:57 UTC (permalink / raw)
  To: David S. Miller; +Cc: Tom Herbert, Eric Dumazet, netdev, Changli Gao

Export internal software RX queues via sysfs.

The RPS software RX queues are exported as
/sys/class/net/$nic/queues/rx-$/sw-rx-$, and you can specify which CPU handles
a specific queue by writing the CPU id to the corresponding sw-rx-$ file.
The number of software RX queues can be set by writing to
/sys/class/net/$nic/queues/rx-$/nr-sw-rx. nr-sw-rx is 0 by default.

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
---
 net/core/net-sysfs.c |  234 ++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 230 insertions(+), 4 deletions(-)
diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
index 96ed690..4a547b7 100644
--- a/net/core/net-sysfs.c
+++ b/net/core/net-sysfs.c
@@ -512,6 +512,167 @@ static struct sysfs_ops rx_queue_sysfs_ops = {
 	.store = rx_queue_attr_store,
 };
 
+static DEFINE_MUTEX(rps_map_lock);
+
+static ssize_t show_sw_rx(struct netdev_rx_queue *queue,
+			     struct rx_queue_attribute *attribute, char *buf)
+{
+	unsigned long id;
+	struct rps_map *map;
+	u16 cpu;
+
+	strict_strtoul(attribute->attr.name + strlen("sw-rx-"), 10, &id);
+	rcu_read_lock();
+	map = rcu_dereference(queue->rps_map);
+	if (map && id < map->len)
+		cpu = map->cpus[id];
+	else
+		cpu = 0;
+	rcu_read_unlock();
+	return sprintf(buf, "%hu\n", cpu);
+}
+
+static ssize_t store_sw_rx(struct netdev_rx_queue *queue,
+			      struct rx_queue_attribute *attribute,
+			      const char *buf, size_t len)
+{
+	unsigned long id, cpu;
+	struct rps_map *map;
+
+	if (!capable(CAP_NET_ADMIN))
+		return -EPERM;
+
+	if (strict_strtoul(buf, 0, &cpu) || cpu >= nr_cpumask_bits)
+		return -EINVAL;
+	strict_strtoul(attribute->attr.name + strlen("sw-rx-"), 10, &id);
+
+	mutex_lock(&rps_map_lock);
+	map = queue->rps_map;
+	if (map && id < map->len)
+		map->cpus[id] = cpu;
+	mutex_unlock(&rps_map_lock);
+
+	return len;
+}
+
+struct sw_rx_attribute {
+	struct rx_queue_attribute	qattr;
+	atomic_t			ref;
+};
+
+static inline void sw_rx_attribute_free(struct sw_rx_attribute *attr)
+{
+	kfree(attr->qattr.attr.name);
+	kfree(attr);
+}
+
+static struct sw_rx_attribute **sw_rx_attr;
+static int sw_rx_attr_size;
+
+#define SW_RX_MAX 65535
+
+static void shrink_sw_rx_attr(void)
+{
+	struct sw_rx_attribute **attrs;
+
+	if (sw_rx_attr_size == 0) {
+		kfree(sw_rx_attr);
+		sw_rx_attr = NULL;
+		return;
+	}
+
+	attrs = kmalloc(sw_rx_attr_size * sizeof(void *), GFP_KERNEL);
+	if (attrs == NULL)
+		return;
+	memcpy(attrs, sw_rx_attr, sw_rx_attr_size * sizeof(void *));
+	swap(attrs, sw_rx_attr);
+	kfree(attrs);
+}
+
+/* must be called with rps_map_lock locked */
+static int update_sw_rx_files(struct kobject *kobj,
+				 struct rps_map *old_map, struct rps_map *map)
+{
+	int i;
+	int old_map_len = old_map ? old_map->len : 0;
+	int map_len = map ? map->len : 0;
+
+	if (old_map_len >= map_len) {
+		bool shrink = false;
+
+		for (i = old_map_len - 1; i >= map_len; i--) {
+			sysfs_remove_file(kobj, &sw_rx_attr[i]->qattr.attr);
+			if (atomic_dec_and_test(&sw_rx_attr[i]->ref)) {
+				sw_rx_attribute_free(sw_rx_attr[i]);
+				sw_rx_attr_size--;
+				shrink = true;
+			}
+
+		}
+
+		if (shrink)
+			shrink_sw_rx_attr();
+
+		return 0;
+	}
+
+	if (map_len > sw_rx_attr_size) {
+		struct sw_rx_attribute **attrs;
+		char name[sizeof("sw-rx-" __stringify(SW_RX_MAX))];
+		char *pname;
+
+		attrs = krealloc(sw_rx_attr, map_len * sizeof(void *),
+				 GFP_KERNEL);
+		if (attrs == NULL)
+			return -ENOMEM;
+		sw_rx_attr = attrs;
+		for (i = sw_rx_attr_size; i < map_len; i++) {
+			sw_rx_attr[i] = kmalloc(sizeof(**attrs), GFP_KERNEL);
+			if (sw_rx_attr[i] == NULL)
+				break;
+			sprintf(name, "sw-rx-%d", i);
+			pname = kstrdup(name, GFP_KERNEL);
+			if (pname == NULL) {
+				kfree(sw_rx_attr[i]);
+				break;
+			}
+			sw_rx_attr[i]->qattr.attr.name = pname;
+			sw_rx_attr[i]->qattr.attr.mode = S_IRUGO | S_IWUSR;
+			sw_rx_attr[i]->qattr.show = show_sw_rx;
+			sw_rx_attr[i]->qattr.store = store_sw_rx;
+			atomic_set(&sw_rx_attr[i]->ref, 0);
+		}
+		if (i != map_len) {
+			while (--i >= sw_rx_attr_size)
+				sw_rx_attribute_free(sw_rx_attr[i]);
+			shrink_sw_rx_attr();
+			return -ENOMEM;
+		}
+	}
+
+	for (i = old_map_len; i < map_len; i++) {
+		atomic_inc(&sw_rx_attr[i]->ref);
+		if (sysfs_create_file(kobj, &sw_rx_attr[i]->qattr.attr) == 0)
+			continue;
+		atomic_dec(&sw_rx_attr[i]->ref);
+		while (--i >= old_map_len) {
+			sysfs_remove_file(kobj, &sw_rx_attr[i]->qattr.attr);
+			atomic_dec(&sw_rx_attr[i]->ref);
+		}
+		if (sw_rx_attr_size < map_len) {
+			for (i = sw_rx_attr_size; i < map_len; i++)
+				sw_rx_attribute_free(sw_rx_attr[i]);
+			shrink_sw_rx_attr();
+		}
+		return -ENOMEM;
+	}
+
+	if (sw_rx_attr_size < map_len)
+		sw_rx_attr_size = map_len;
+
+	return 0;
+}
+
 static ssize_t show_rps_map(struct netdev_rx_queue *queue,
 			    struct rx_queue_attribute *attribute, char *buf)
 {
@@ -556,7 +717,6 @@ ssize_t store_rps_map(struct netdev_rx_queue *queue,
 	struct rps_map *old_map, *map;
 	cpumask_var_t mask;
 	int err, cpu, i;
-	static DEFINE_SPINLOCK(rps_map_lock);
 
 	if (!capable(CAP_NET_ADMIN))
 		return -EPERM;
@@ -589,10 +749,15 @@ ssize_t store_rps_map(struct netdev_rx_queue *queue,
 		map = NULL;
 	}
 
-	spin_lock(&rps_map_lock);
+	mutex_lock(&rps_map_lock);
 	old_map = queue->rps_map;
-	rcu_assign_pointer(queue->rps_map, map);
-	spin_unlock(&rps_map_lock);
+	err = update_sw_rx_files(&queue->kobj, old_map, map);
+	if (!err)
+		rcu_assign_pointer(queue->rps_map, map);
+	mutex_unlock(&rps_map_lock);
+
+	if (err)
+		return err;
 
 	if (old_map)
 		call_rcu(&old_map->rcu, rps_map_release);
@@ -604,8 +769,69 @@ ssize_t store_rps_map(struct netdev_rx_queue *queue,
 static struct rx_queue_attribute rps_cpus_attribute =
 	__ATTR(rps_cpus, S_IRUGO | S_IWUSR, show_rps_map, store_rps_map);
 
+static ssize_t show_nr_sw_rx(struct netdev_rx_queue *queue,
+			      struct rx_queue_attribute *attribute, char *buf)
+{
+	struct rps_map *map;
+	unsigned int len;
+
+	rcu_read_lock();
+	map = rcu_dereference(queue->rps_map);
+	len = map ? map->len : 0;
+	rcu_read_unlock();
+	return sprintf(buf, "%u\n", len);
+}
+
+static ssize_t store_nr_sw_rx(struct netdev_rx_queue *queue,
+			      struct rx_queue_attribute *attribute,
+			      const char *buf, size_t len)
+{
+	struct rps_map *old_map, *map;
+	unsigned long nr;
+	int err;
+
+	if (!capable(CAP_NET_ADMIN))
+		return -EPERM;
+
+	if (strict_strtoul(buf, 0, &nr) || nr > SW_RX_MAX + 1)
+		return -EINVAL;
+	if (nr != 0) {
+		map = kzalloc(max_t(unsigned, RPS_MAP_SIZE(nr), L1_CACHE_BYTES),
+			      GFP_KERNEL);
+		if (map == NULL)
+			return -ENOMEM;
+		map->len = nr;
+	} else {
+		map = NULL;
+	}
+
+	mutex_lock(&rps_map_lock);
+	old_map = queue->rps_map;
+	err = update_sw_rx_files(&queue->kobj, old_map, map);
+	if (!err) {
+		if (old_map && map)
+			memcpy(map->cpus, old_map->cpus,
+			       sizeof(map->cpus[0]) *
+			       min_t(unsigned int, nr, old_map->len));
+		rcu_assign_pointer(queue->rps_map, map);
+	}
+	mutex_unlock(&rps_map_lock);
+
+	if (err)
+		return err;
+
+	if (old_map)
+		call_rcu(&old_map->rcu, rps_map_release);
+
+	return len;
+}
+
+static struct rx_queue_attribute nr_sw_rx_attribute =
+	__ATTR(nr-sw-rx, S_IRUGO | S_IWUSR, show_nr_sw_rx, store_nr_sw_rx);
+
 static struct attribute *rx_queue_default_attrs[] = {
 	&rps_cpus_attribute.attr,
+	&nr_sw_rx_attribute.attr,
 	NULL
 };
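
show_sw_rx() and store_sw_rx() above recover the queue index from the
attribute name and ignore the return value of strict_strtoul(). A minimal
userspace sketch of that parsing step with the result checked could look
like the following. sw_rx_parse_id() is a hypothetical helper, and
strtoul() stands in for the kernel's strict_strtoul(); SW_RX_MAX mirrors
the value defined in the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

#define SW_RX_PREFIX "sw-rx-"
#define SW_RX_MAX 65535

/* Extract N from "sw-rx-N". Returns 0 and stores N, or -EINVAL. */
static int sw_rx_parse_id(const char *name, unsigned long *id)
{
	const size_t plen = strlen(SW_RX_PREFIX);
	unsigned long val;
	char *end;

	assert(name != NULL && id != NULL);

	if (strncmp(name, SW_RX_PREFIX, plen) != 0)
		return -EINVAL;
	if (name[plen] < '0' || name[plen] > '9')
		return -EINVAL;	/* reject empty suffix, sign, whitespace */

	errno = 0;
	val = strtoul(name + plen, &end, 10);
	if (errno || *end != '\0' || val > SW_RX_MAX)
		return -EINVAL;

	*id = val;
	return 0;
}
```

Since the attribute names are generated by the patch itself, a parse
failure should not happen in practice; checking it mainly avoids using an
uninitialised id if the naming scheme ever changes.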
 

^ permalink raw reply related

* Re: [Bonding-devel] [v3 Patch 2/3] bridge: make bridge support netpoll
From: Cong Wang @ 2010-04-14  8:11 UTC (permalink / raw)
  To: Jay Vosburgh
  Cc: Stephen Hemminger, Eric Dumazet, Neil Horman, netdev,
	Andy Gospodarek, bridge, linux-kernel, bonding-devel, Jeff Moyer,
	Matt Mackall, David Miller
In-Reply-To: <8304.1271177567@death.nxdomain.ibm.com>

Jay Vosburgh wrote:
> Cong Wang <amwang@redhat.com> wrote:
> 
>> Stephen Hemminger wrote:
>>> On Mon, 12 Apr 2010 12:38:57 +0200
>>> Eric Dumazet <eric.dumazet@gmail.com> wrote:
>>>
>>>> On Monday 12 April 2010 at 18:37 +0800, Cong Wang wrote:
>>>>> Stephen Hemminger wrote:
>>>>>> There is no protection on dev->priv_flags for SMP access.
>>>>>> It would be better to use a bit value in dev->state if you are using it as a control flag.
>>>>>>
>>>>>> Then you could use 
>>>>>> 			if (unlikely(test_and_clear_bit(__IN_NETPOLL, &skb->dev->state)))
>>>>>> 				netpoll_send_skb(...)
>>>>>>
>>>>>>
>>>>> Hmm, I think we can't use ->state here; it is not meant for this kind of purpose,
>>>>> according to its comments.
>>>>>
>>>>> Also, I find that other users of the IFF_XXX flags in ->priv_flags also use
>>>>> &, | to set or clear the flags. So there must be something else preventing
>>>>> the race...
>>>> Yes, it's RTNL that protects priv_flags changes, hopefully...
>>>>
>>> The patch was not protecting priv_flags with RTNL.
>>> For example..
>>>
>>>
>>> @@ -308,7 +312,9 @@ static void netpoll_send_skb(struct netp
>>>  		     tries > 0; --tries) {
>>>  			if (__netif_tx_trylock(txq)) {
>>>  				if (!netif_tx_queue_stopped(txq)) {
>>> +					dev->priv_flags |= IFF_IN_NETPOLL;
>>>  					status = ops->ndo_start_xmit(skb, dev);
>>> +					dev->priv_flags &= ~IFF_IN_NETPOLL;
>>>  					if (status == NETDEV_TX_OK)
>>>  						txq_trans_update(txq);
>> Hmm, but I checked the bonding case (IFF_BONDING), it doesn't
>> hold rtnl_lock. Strange.
> 
> 	I looked, and there are a couple of cases in bonding that don't
> have RTNL for adjusting priv_flags (in bond_ab_arp_probe when no slaves
> are up, and a couple of cases in 802.3ad).  I think the solution there
> is to move bonding away from priv_flags for some of this (e.g., convert
> bonding to use a frame hook like bridge and macvlan, and greatly
> simplify skb_bond_should_drop), but that's a separate topic.
> 
> 	The majority of the cases, however, do hold RTNL.  Bonding
> generally doesn't have to acquire RTNL itself, since whatever called
> into bonding is holding it already.  For example, the slave add and
> remove paths (bond_enslave, bond_release) are called either via sysfs or
> ioctl, both of which acquire RTNL.  All of the set and clear operations
> for IFF_BONDING fall into this category; look at bonding_store_slaves
> for an example.
> 
> 	Bonding does acquire RTNL itself when performing failovers,
> e.g., bond_mii_monitor holds RTNL prior to calling bond_miimon_commit,
> which will change priv_flags.
> 

Thanks a lot for your reply!

You are right, I missed something.

Hmm, for bonding, the RTNL lock is necessary because there are sysfs
and ioctl interfaces that change its configuration.

^ permalink raw reply

* Re: [Bonding-devel] [v3 Patch 2/3] bridge: make bridge support netpoll
From: Cong Wang @ 2010-04-14  8:16 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: Jay Vosburgh, Eric Dumazet, Neil Horman, netdev, Andy Gospodarek,
	bridge, linux-kernel, bonding-devel, Jeff Moyer, Matt Mackall,
	David Miller
In-Reply-To: <20100413103320.11a2a4f7@nehalam>

Stephen Hemminger wrote:
> On Tue, 13 Apr 2010 09:52:47 -0700
> Jay Vosburgh <fubar@us.ibm.com> wrote:
> 
>> Cong Wang <amwang@redhat.com> wrote:
>>
>>> Stephen Hemminger wrote:
>>>> On Mon, 12 Apr 2010 12:38:57 +0200
>>>> Eric Dumazet <eric.dumazet@gmail.com> wrote:
>>>>
>>>>> On Monday 12 April 2010 at 18:37 +0800, Cong Wang wrote:
>>>>>> Stephen Hemminger wrote:
>>>>>>> There is no protection on dev->priv_flags for SMP access.
>>>>>>> It would be better to use a bit value in dev->state if you are using it as a control flag.
>>>>>>>
>>>>>>> Then you could use 
>>>>>>> 			if (unlikely(test_and_clear_bit(__IN_NETPOLL, &skb->dev->state)))
>>>>>>> 				netpoll_send_skb(...)
>>>>>>>
>>>>>>>
>>>>>> Hmm, I think we can't use ->state here; it is not meant for this kind of purpose,
>>>>>> according to its comments.
>>>>>>
>>>>>> Also, I find that other users of the IFF_XXX flags in ->priv_flags also use
>>>>>> &, | to set or clear the flags. So there must be something else preventing
>>>>>> the race...
>>>>> Yes, it's RTNL that protects priv_flags changes, hopefully...
>>>>>
>>>> The patch was not protecting priv_flags with RTNL.
>>>> For example..
>>>>
>>>>
>>>> @@ -308,7 +312,9 @@ static void netpoll_send_skb(struct netp
>>>>  		     tries > 0; --tries) {
>>>>  			if (__netif_tx_trylock(txq)) {
>>>>  				if (!netif_tx_queue_stopped(txq)) {
>>>> +					dev->priv_flags |= IFF_IN_NETPOLL;
>>>>  					status = ops->ndo_start_xmit(skb, dev);
>>>> +					dev->priv_flags &= ~IFF_IN_NETPOLL;
>>>>  					if (status == NETDEV_TX_OK)
>>>>  						txq_trans_update(txq);
>>> Hmm, but I checked the bonding case (IFF_BONDING), it doesn't
>>> hold rtnl_lock. Strange.
>> 	I looked, and there are a couple of cases in bonding that don't
>> have RTNL for adjusting priv_flags (in bond_ab_arp_probe when no slaves
>> are up, and a couple of cases in 802.3ad).  I think the solution there
>> is to move bonding away from priv_flags for some of this (e.g., convert
>> bonding to use a frame hook like bridge and macvlan, and greatly
>> simplify skb_bond_should_drop), but that's a separate topic.
>>
>> 	The majority of the cases, however, do hold RTNL.  Bonding
>> generally doesn't have to acquire RTNL itself, since whatever called
>> into bonding is holding it already.  For example, the slave add and
>> remove paths (bond_enslave, bond_release) are called either via sysfs or
>> ioctl, both of which acquire RTNL.  All of the set and clear operations
>> for IFF_BONDING fall into this category; look at bonding_store_slaves
>> for an example.
>>
>> 	Bonding does acquire RTNL itself when performing failovers,
>> e.g., bond_mii_monitor holds RTNL prior to calling bond_miimon_commit,
>> which will change priv_flags.
>>
> 
> All this was related to netpoll. And netpoll processing often needs to occur
> in hard IRQ context. Therefore netpoll stuff and RTNL (which is a mutex)
> really don't mix well.  Keep RTNL for what it was meant for: network
> reconfiguration. Don't turn it into a network-specific BKL.
> 

Hmm, I think for my patch, holding the RTNL lock is not necessary,
because there are no other call paths that change the IFF_IN_NETPOLL bit,
unlike the bonding or bridge cases where sysfs/ioctl is provided
to change it.

The only place that changes IFF_IN_NETPOLL is netpoll_send_skb(),
which can't be called simultaneously because there are other locks
protecting it.

Or am I still missing something?

Thanks.
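
The alternative Stephen suggested earlier in the thread, an atomic bit
instead of a plain |= / &= on a flags word, can be sketched in userspace
with C11 atomics. This is an illustration only: flag_set(),
flag_test_and_clear() and IN_NETPOLL_BIT are hypothetical names, and
atomic_fetch_or()/atomic_fetch_and() stand in for the kernel's
set_bit()/test_and_clear_bit().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define IN_NETPOLL_BIT 0	/* hypothetical bit number */

/* Set one bit atomically; other bits in the word are left untouched. */
static void flag_set(atomic_ulong *state, unsigned int bit)
{
	assert(bit < 8 * sizeof(unsigned long));
	atomic_fetch_or(state, 1UL << bit);
}

/*
 * Clear one bit atomically and report whether it was set before.
 * A concurrent update to a different bit cannot be lost, which is the
 * risk with a non-atomic read-modify-write such as "flags &= ~mask".
 */
static bool flag_test_and_clear(atomic_ulong *state, unsigned int bit)
{
	unsigned long mask = 1UL << bit;

	assert(bit < 8 * sizeof(unsigned long));
	return (atomic_fetch_and(state, ~mask) & mask) != 0;
}
```

If every writer of the flags word is already serialized by some other
lock, the plain form is sufficient; the atomic form only matters when two
contexts can modify different bits of the same word at the same time.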

^ permalink raw reply

* Intel Pro1000 on CN5020, hangs Uboot
From: Jack Daniel @ 2010-04-14  8:18 UTC (permalink / raw)
  To: netdev; +Cc: davem

Hi,

I have an OCTEON CN5020, into which I tried plugging an Intel Pro 1000
PCI NIC. But U-Boot hangs with a trap exception. U-Boot has no problems
with a Realtek RTL8139 PCI NIC that supports 100 Mbps. Could
someone tell me the reason for this behaviour?

Regards,
Jack


Uboot Version (as reported by Uboot) : U-Boot 1.1.1 (Development
build, svnversion: u-boot:47725, exec:47725)
Uboot start up message:
PAL rev: 1.01, MCU rev: 1.07, CPU voltage: 1.20
DRAM:  512 MB
Clearing DRAM....... done
Flash:  8 MB
BIST check passed.
Starting PCI
PCI Status: PCI 32-bit
Reg: 0x0 0x0
Reg: 0x1 0x62
Reg: 0x2 0x0
Reg: 0x3 0x0
Reg: 0x4 0x0
Reg: 0x5 0xE
Reg: 0x6 0x1
Reg: 0x7 0xFFFFFFFFC00D5C70
Reg: 0x8 0x800119040000000E
Reg: 0x9 0x61784
Reg: 0xA 0xFFFFFFFFC0000A4C
Reg: 0xB 0xFFFFFFFFC0062208
Reg: 0xC 0xFFFFFFFFC0062550
Reg: 0xD 0xFFFFFFFFC005E4A0
Reg: 0xE 0x20
Reg: 0xF 0x0
Reg: 0x10 0xFFFFFFFFC00D5CA0
Reg: 0x11 0xFFFFFFFFC008F490
Reg: 0x12 0x0
Reg: 0x13 0x0
Reg: 0x14 0x0
Reg: 0x15 0x0
Reg: 0x16 0xFF00
Reg: 0x17 0xFFFFFFFFC00D5CA6
Reg: 0x18 0xFFFFFFFFC0062550
Reg: 0x19 0xFFFFFFFFC0038E24
Reg: 0x1A 0xFFFFFFFFFFFF8000
Reg: 0x1B 0x30
Reg: 0x1C 0xFFFFFFFFBFC621D0
Reg: 0x1D 0xFFFFFFFFFFFF97F8
Reg: 0x1E 0xFFFFFFFFC00D5CA4
Reg: 0x1F 0xFFFFFFFFBFC00AE0
status:   0x504000E7
cause:    0x4000801C
epc:      0xFFFFFFFFC0038EC8
badvaddr: 0x0

^ permalink raw reply

* Re: usb-sound circular locking again?
From: Richard Zidlicky @ 2010-04-14  8:26 UTC (permalink / raw)
  To: Takashi Iwai; +Cc: Andrew Morton, linux-kernel, netdev
In-Reply-To: <s5h4oje5zl9.wl%tiwai@suse.de>

Hi,

> > is this the same old issue?
> 
> I think so.  It appears relatively new since a sysfs lockdep check was
> introduced.

you are right, it was definitely my impression that this particular instance is
new (the last kernel I tested before was 2.6.32.8).
After a few more tests it appears to be 100% repeatable in pm-hibernate. Simply
doing "sync" right now does not trigger it.

Richard

^ permalink raw reply

* Re: [PATCH net-next-2.6] fasync: RCU locking
From: Lai Jiangshan @ 2010-04-14  8:36 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David Miller, Paul E. McKenney, netdev, linux-kernel
In-Reply-To: <1271230961.16881.630.camel@edumazet-laptop>

Eric Dumazet wrote:
> -void __kill_fasync(struct fasync_struct *fa, int sig, int band)
> +/*
> + * rcu_read_lock() is held
> + */
> +static void kill_fasync_rcu(struct fasync_struct *fa, int sig, int band)
>  {
>  	while (fa) {
>  		struct fown_struct * fown;
> @@ -719,22 +728,19 @@ void __kill_fasync(struct fasync_struct *fa, int sig, int band)
>  		   mechanism. */
>  		if (!(sig == SIGURG && fown->signum == 0))
>  			send_sigio(fown, fa->fa_fd, band);
> -		fa = fa->fa_next;
> +		fa = rcu_dereference(fa->fa_next);
>  	}
>  }
>  

Since rcu_read_lock() protects fasync_struct *fa for us, we can access
@fa safely even if fasync_remove_entry() has just been called.

But this patch does not ensure that fa->fa_file is not freed or that
fa->fa_fd is not released, so kill_fasync_rcu() may do the wrong thing
unless some other code ensures it.

^ permalink raw reply

* [PATCH v3] net: batch skb dequeueing from softnet input_pkt_queue
From: Changli Gao @ 2010-04-14  9:52 UTC (permalink / raw)
  To: David S. Miller; +Cc: Eric Dumazet, netdev, Changli Gao

batch skb dequeueing from softnet input_pkt_queue

batch skb dequeueing from softnet input_pkt_queue to reduce potential lock
contention and irq disabling/enabling.

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
----
 include/linux/netdevice.h |    1 
 net/core/dev.c            |   56 ++++++++++++++++++++++++++++++++--------------
 2 files changed, 40 insertions(+), 17 deletions(-)
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index d1a21b5..898bc62 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1335,6 +1335,7 @@ struct softnet_data {
 	struct call_single_data	csd ____cacheline_aligned_in_smp;
 #endif
 	struct sk_buff_head	input_pkt_queue;
+	struct sk_buff_head	processing_queue;
 	struct napi_struct	backlog;
 };
 
diff --git a/net/core/dev.c b/net/core/dev.c
index a10a216..c635a71 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -131,6 +131,7 @@
 #include <linux/random.h>
 #include <trace/events/napi.h>
 #include <linux/pci.h>
+#include <linux/stop_machine.h>
 
 #include "net-sysfs.h"
 
@@ -2332,6 +2333,7 @@ static int enqueue_to_backlog(struct sk_buff *skb, int cpu)
 {
 	struct softnet_data *queue;
 	unsigned long flags;
+	u32 qlen;
 
 	queue = &per_cpu(softnet_data, cpu);
 
@@ -2339,8 +2341,9 @@ static int enqueue_to_backlog(struct sk_buff *skb, int cpu)
 	__get_cpu_var(netdev_rx_stat).total++;
 
 	rps_lock(queue);
-	if (queue->input_pkt_queue.qlen <= netdev_max_backlog) {
-		if (queue->input_pkt_queue.qlen) {
+	qlen = queue->input_pkt_queue.qlen + queue->processing_queue.qlen;
+	if (qlen <= netdev_max_backlog) {
+		if (qlen) {
 enqueue:
 			__skb_queue_tail(&queue->input_pkt_queue, skb);
 			rps_unlock(queue);
@@ -2791,19 +2794,31 @@ int netif_receive_skb(struct sk_buff *skb)
 EXPORT_SYMBOL(netif_receive_skb);
 
 /* Network device is going away, flush any packets still pending  */
-static void flush_backlog(void *arg)
+static void __flush_backlog(struct sk_buff_head *head, struct net_device *dev)
 {
-	struct net_device *dev = arg;
-	struct softnet_data *queue = &__get_cpu_var(softnet_data);
 	struct sk_buff *skb, *tmp;
 
-	rps_lock(queue);
-	skb_queue_walk_safe(&queue->input_pkt_queue, skb, tmp)
+	skb_queue_walk_safe(head, skb, tmp) {
 		if (skb->dev == dev) {
-			__skb_unlink(skb, &queue->input_pkt_queue);
+			__skb_unlink(skb, head);
 			kfree_skb(skb);
 		}
-	rps_unlock(queue);
+	}
+}
+
+static int flush_backlog(void *arg)
+{
+	struct net_device *dev = arg;
+	struct softnet_data *queue;
+	int cpu;
+
+	for_each_online_cpu(cpu) {
+		queue = &per_cpu(softnet_data, cpu);
+		__flush_backlog(&queue->input_pkt_queue, dev);
+		__flush_backlog(&queue->processing_queue, dev);
+	}
+
+	return 0;
 }
 
 static int napi_gro_complete(struct sk_buff *skb)
@@ -3118,20 +3133,23 @@ static int process_backlog(struct napi_struct *napi, int quota)
 
 		local_irq_disable();
 		rps_lock(queue);
-		skb = __skb_dequeue(&queue->input_pkt_queue);
-		if (!skb) {
+		skb_queue_splice_tail_init(&queue->input_pkt_queue,
+					   &queue->processing_queue);
+		if (skb_queue_empty(&queue->processing_queue)) {
 			__napi_complete(napi);
 			rps_unlock(queue);
 			local_irq_enable();
-			break;
+			return work;
 		}
 		rps_unlock(queue);
 		local_irq_enable();
 
-		__netif_receive_skb(skb);
-	} while (++work < quota && jiffies == start_time);
-
-	return work;
+		while ((skb = __skb_dequeue(&queue->processing_queue))) {
+			__netif_receive_skb(skb);
+			if (++work >= quota || jiffies != start_time)
+				return work;
+		}
+	} while (1);
 }
 
 /**
@@ -5027,7 +5045,7 @@ void netdev_run_todo(void)
 
 		dev->reg_state = NETREG_UNREGISTERED;
 
-		on_each_cpu(flush_backlog, dev, 1);
+		stop_machine(flush_backlog, dev, NULL);
 
 		netdev_wait_allrefs(dev);
 
@@ -5487,6 +5505,9 @@ static int dev_cpu_callback(struct notifier_block *nfb,
 	raise_softirq_irqoff(NET_TX_SOFTIRQ);
 	local_irq_enable();
 
+	while ((skb = __skb_dequeue(&oldsd->processing_queue)))
+		netif_rx(skb);
+
 	/* Process offline CPU's input_pkt_queue */
 	while ((skb = __skb_dequeue(&oldsd->input_pkt_queue)))
 		netif_rx(skb);
@@ -5709,6 +5730,7 @@ static int __init net_dev_init(void)
 
 		queue = &per_cpu(softnet_data, i);
 		skb_queue_head_init(&queue->input_pkt_queue);
+		skb_queue_head_init(&queue->processing_queue);
 		queue->completion_queue = NULL;
 		INIT_LIST_HEAD(&queue->poll_list);
 

^ permalink raw reply related

* Re: [PATCH] forcedeth: fix tx limit2 flag check
From: stephen mulcahy @ 2010-04-14 10:14 UTC (permalink / raw)
  To: Ayaz Abdulla
  Cc: David Miller, eric.dumazet@gmail.com, bhutchings@solarflare.com,
	netdev@vger.kernel.org, ben@decadent.org.uk,
	572201@bugs.debian.org
In-Reply-To: <4BC5532E.7000302@nvidia.com>

Ayaz Abdulla wrote:
> This patch fixes the TX_LIMIT feature flag. The previous logic check for 
> TX_LIMIT2 also took into account a device that only had TX_LIMIT set.
> 
> Signed-off-by: Ayaz Abdulla <aabdulla@nvidia.com>
> 
> This is a fix for bug 572201 @ bugs.debian.org

Hi,

Thanks! I'll rebuild my Debian kernel with this and run a test today.

-stephen

^ permalink raw reply

* Re: HTB - What's the minimal value for 'rate' parameter?
From: Antonio Almeida @ 2010-04-14 10:22 UTC (permalink / raw)
  To: Jarek Poplawski; +Cc: netdev, kaber, davem, devik
In-Reply-To: <20100409212657.GA3560@del.dom.local>

What do you mean by "1:2 has grandchildren with overflown rate tables"?
I couldn't understand your idea. Is there any mistake in the
configuration I sent?
How would you set rates for this particular example?

Regards
  Antonio Almeida



On Fri, Apr 9, 2010 at 10:26 PM, Jarek Poplawski wrote:
> On Fri, Apr 09, 2010 at 04:40:44PM +0100, Antonio Almeida wrote:
>> So, what about the rate limit miss?
>> As you can see the ceil of class 1:2 is set to 4096Kbit but its
>> sending rate is actually 8071Kbit!
>> It looks like classes 1:10 and 1:11 are ignoring hierarchical rate
>> restrictions of class 1:2
>> Here:
>> class htb 1:2 parent 1:1 rate 4096Kbit ceil 4096Kbit burst 3655b cburst 3655b
>>  Sent 84285894 bytes 55671 pkt (dropped 0, overlimits 0 requeues 0)
>>  rate 8071Kbit 666pps backlog 0b 0p requeues 0
>>  lended: 0 borrowed: 0 giants: 0
>>  tokens: -937499999 ctokens: -937499999
>
> Yes, since 1:2 has grandchildren with overflown rate tables, they
> could behave as if they had set rates higher than their parents or
> grandparent (and HTB doesn't restrict it hierarchically).
>
> Jarek P.
>

^ permalink raw reply

* rps performance WAS(Re: rps: question
From: jamal @ 2010-04-14 11:53 UTC (permalink / raw)
  To: Tom Herbert
  Cc: Eric Dumazet, netdev, robert, David Miller, Changli Gao,
	Andi Kleen
In-Reply-To: <1265641748.3688.56.camel@bigi>

Following up like promised:

On Mon, 2010-02-08 at 10:09 -0500, jamal wrote:
> On Sun, 2010-02-07 at 21:58 -0800, Tom Herbert wrote:
> 
> > I don't have specific numbers, although we are using this on
> > application doing forwarding and numbers seem in line with what we see
> > for an end host.
> > 
> 
> When i get the chance i will give it a run. I have access to an i7
> somewhere. It seems like i need some specific nics?

I did step #0 last night on an i7 (single Nehalem). I think more than
anything i was impressed by the Nehalem's excellent caching system.
Robert, I am almost tempted to say skb recycling performance will be
excellent on this  machine given the cost of a cache miss is much lower
than previous generation hardware.

My test was simple: irq affinity on cpu0(core0) and rps redirection to
cpu1(core 1); tried also to redirect to different SMT threads (aka CPUs)
on different cores with similar results. I base tested against no rps
being used and a kernel which didn't have any RPS config on.
[BTW, I had to hand-edit the .config since i couldn't do it from
menuconfig (Is there any reason for it to be so?)]

Traffic was sent from another machine into the i7 via an el-cheapo sky2
(don't know how shitty this NIC is, but it seems to know how to do MSI so
probably capable of multiqueueing); the test was several sets of 
a ping first and then a ping -f (I will get more sophisticated in my
next test likely this weekend).

Results:
CPU utilization was about 20-30% higher in the case of rps. On cpu0, the
cpu was being chewed highly by sky2_poll and on the redirected-to-core
it was always smp_call_function_single.
Latency was (consistently) about 5 microseconds higher on average with rps.
So if i sent 1M ping -f packets, without RPS it took on average
176 seconds and with RPS it took 181 seconds to do the round-trips.
Throughput didn't change but this could be attributed to the low amounts
of data i was sending.
I observed that we were generating, on average, an IPI per packet even
with ping -f. (added an extra stat to record when we sent an IPI and
counted against the number of packets sent).
In my opinion it is these IPIs that contribute the most to the latency
and i think it happens that the Nehalem is just highly improved in this 
area. I wish i had a more commonly used machine to test rps on.
I expect that rps will perform worse on currently cheaper/older hardware
for the traffic characteristic i tested.

On IPIs:
Is anyone familiar with what is going on with Nehalem? Why is it this
good? I expect things will get a lot nastier with other hardware like
xeon based or even Nehalem with rps going across QPI.
Here's why i think IPIs are bad, please correct me if i am wrong:
- they are synchronous, i.e. an IPI issuer has to wait for an ACK (which
is in the form of an IPI).
- data cache has to be synced to main memory
- the instruction pipeline is flushed
- what else did i miss? Andi?

So my question to Tom, Eric and Changli or anyone else who has been
running RPS:
What hardware did you use? Is there anyone using older hardware than
say AMD Opteron or Intel Nehalem?

My impressions of rps so far:
I think i may end up being impressed when i generate a lot more traffic
since the cost of IPI will be amortized. 
At this point multiqueue seems a much more impressive alternative and it
seems to me multiqueue hardware is a lot more commodity (price-point)
than a Nehalem.

Plan:
I plan to still attack the app space (and write a basic udp app that
binds to one or more rps cpus and try blasting a lot of UDP traffic to
see what happens); my step after that is to move to forwarding tests.
 
cheers,
jamal


^ permalink raw reply

* Re: [PATCH] tun: orphan an skb on tx
From: David Miller @ 2010-04-14 11:55 UTC (permalink / raw)
  To: herbert
  Cc: eric.dumazet, mst, jan.kiszka, paul.moore, David.Woodhouse,
	netdev, linux-kernel, qemu-devel
In-Reply-To: <20100414005822.GD18044@gondor.apana.org.au>

From: Herbert Xu <herbert@gondor.apana.org.au>
Date: Wed, 14 Apr 2010 08:58:22 +0800

> On Tue, Apr 13, 2010 at 08:31:03PM +0200, Eric Dumazet wrote:
>>
>> Herbert Acked your patch, so I guess its OK, but I think it can be
>> dangerous.
> 
> The tun socket accounting was never designed to stop it from
> flooding another tun interface.  It's there to stop it from
> transmitting above a destination interface TX bandwidth and
> cause unnecessary packet drops.  It also limits the total amount
> of kernel memory that can be pinned down by a single tun interface.
> 
> In this case, all we're doing is shifting the accounting from the
> "hardware" queue to the qdisc queue.
> 
> So your ability to flood a tun interface is essentially unchanged.
> 
> BTW we do the same thing in a number of hardware drivers, as well
> as virtio-net.

Right.  Although this reminds me about the whole SKB
orphaning on xmit issue that keeps coming back to haunt
us.

If there weren't odd references to the SKB's socket in
the packet scheduler et al. we could just orphan these
things right upon entry to the qdisc and not have to
add hacks like this to every driver.

In fact... maybe we can just do it in dev_hard_queue_xmit()
since we are out of the qdisc at that point.... but I guess
there might be weird drivers that want the SKB socket in
their ->xmit routine...  Ho hum.

In any event that's net-next-2.6 exploratory material, and I've
applied this patch to net-2.6, thanks!

^ permalink raw reply

* Re: [PATCH v3] net: batch skb dequeueing from softnet input_pkt_queue
From: jamal @ 2010-04-14 11:58 UTC (permalink / raw)
  To: Changli Gao; +Cc: David S. Miller, Eric Dumazet, netdev
In-Reply-To: <1271238738-8386-1-git-send-email-xiaosuo@gmail.com>

On Wed, 2010-04-14 at 17:52 +0800, Changli Gao wrote:
> batch skb dequeueing from softnet input_pkt_queue
> 
> batch skb dequeueing from softnet input_pkt_queue to reduce potential lock
> contention and irq disabling/enabling.
> 
> Signed-off-by: Changli Gao <xiaosuo@gmail.com>

It seems we are now going to generate a lot more IPIs with such a
change. At least this is what i am imagining.
CPU0: packet comes in, queue empty, generate an IPI to CPU1
CPU0: second packet comes in, enqueue
CPU1: grab two packets to process and run with them
CPU0: packet comes in, queue empty, generate an IPI to CPU1
..
...
.....

IPIs add to latency (refer to my other email). Did you test this
to reach some conclusion that it improves things or was it just by
inspection?

cheers,
jamal


^ permalink raw reply

* Re: [RFC] random SYN drops causing connect() delays
From: Lennart Schulte @ 2010-04-14 11:37 UTC (permalink / raw)
  To: tgraf; +Cc: netdev
In-Reply-To: <20100412080633.GA27418@bombadil.infradead.org>

Hi,
this is very similar to what i have noticed, but up to now I couldn't figure out where it came from. 
Thanks very much for clearing it up!

> I have been tracking down an issue commonly referred to as the 3-sec
> connect() delay. It exists since recent 2.6.x kernels and has never
> been fixed even though it disappeared in recent releases unless
> sched_child_runs_first is set to 1 again.
>
> What happens is that if a client attempts to open many connections to
> a socket with only minimal delay in between attempts some SYNs are
> randomly dropped on the server side causing the client to resend after
> the 3 sec TCP timeout and thus causing connect()s to be randomly delayed.
>
> Facts:
>  - Issue can be reproduced over loopback or real networks.
>  - Enabling SO_LINGER on the client side will make the issue disappear!!
>  - While the issue is appearing, the acceptq seems to be overflowing. Both
>    LISTENOVERFLOWS and LISTENDROPS are increasing although not by the exact
>    number of delay occurrences. inetdiag reports sk_max_ack_backlog to be 0
>    therefore one possibility that comes to mind is that sk_ack_backlog
>    underflows due to a race.
>  - The issue disappeared in recent kernels, I bisected it down to the following
>    commit:
> 	commit 2bba22c50b06abe9fd0d23933b1e64d35b419262
> 	Author: Mike Galbraith <efault@gmx.de>
> 	Date:   Wed Sep 9 15:41:37 2009 +0200
>
> 	    sched: Turn off child_runs_first
> 	    
> 	    Set child_runs_first default to off.
>
>    Setting kernel.sched_child_runs_first=1 makes the issue reappear in recent
>    kernels.  This supports the theory of a race condition.
>  - It looks like the issue can only be reproduced if the server
>    socket sends out data immediately after the connection has been established
>    but I cannot prove this theory.

^ permalink raw reply

