linuxppc-dev.lists.ozlabs.org archive mirror
* [PATCH 1/2] ucc_geth: Do not bring the whole IF down when TX failure.
@ 2010-11-12 13:55 Joakim Tjernlund
  2010-11-12 13:55 ` [PATCH 2/2] ucc_geth: Fix deadlock Joakim Tjernlund
  2010-11-12 14:05 ` [PATCH 1/2] ucc_geth: Do not bring the whole IF down when TX failure Anton Vorontsov
  0 siblings, 2 replies; 7+ messages in thread
From: Joakim Tjernlund @ 2010-11-12 13:55 UTC (permalink / raw)
  To: linuxppc-dev, netdev, Anton Vorontsov

ucc_geth_close lacks a cancel_work_sync(&ugeth->timeout_work)
to stop any outstanding processing of a TX failure. However, one
cannot call cancel_work_sync there without first fixing the timeout
function: the timeout work itself calls ucc_geth_close, so it would
end up waiting for itself and deadlock. This patch brings ucc_geth
in line with gianfar:

Don't bring the interface down and up, just reinit the controller HW
and PHY.

Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
---
 drivers/net/ucc_geth.c |   15 +++++++++------
 1 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ucc_geth.c b/drivers/net/ucc_geth.c
index 97f9f7d..6c254ed 100644
--- a/drivers/net/ucc_geth.c
+++ b/drivers/net/ucc_geth.c
@@ -2065,9 +2065,6 @@ static void ucc_geth_stop(struct ucc_geth_private *ugeth)
 	/* Disable Rx and Tx */
 	clrbits32(&ug_regs->maccfg1, MACCFG1_ENABLE_RX | MACCFG1_ENABLE_TX);
 
-	phy_disconnect(ugeth->phydev);
-	ugeth->phydev = NULL;
-
 	ucc_geth_memclean(ugeth);
 }
 
@@ -3556,7 +3553,10 @@ static int ucc_geth_close(struct net_device *dev)
 
 	napi_disable(&ugeth->napi);
 
+	cancel_work_sync(&ugeth->timeout_work);
 	ucc_geth_stop(ugeth);
+	phy_disconnect(ugeth->phydev);
+	ugeth->phydev = NULL;
 
 	free_irq(ugeth->ug_info->uf_info.irq, ugeth->ndev);
 
@@ -3585,8 +3585,12 @@ static void ucc_geth_timeout_work(struct work_struct *work)
 		 * Must reset MAC *and* PHY. This is done by reopening
 		 * the device.
 		 */
-		ucc_geth_close(dev);
-		ucc_geth_open(dev);
+		netif_tx_stop_all_queues(dev);
+		ucc_geth_stop(ugeth);
+		ucc_geth_init_mac(ugeth);
+		/* Must start PHY here */
+		phy_start(ugeth->phydev);
+		netif_tx_start_all_queues(dev);
 	}
 
 	netif_tx_schedule_all(dev);
@@ -3600,7 +3604,6 @@ static void ucc_geth_timeout(struct net_device *dev)
 {
 	struct ucc_geth_private *ugeth = netdev_priv(dev);
 
-	netif_carrier_off(dev);
 	schedule_work(&ugeth->timeout_work);
 }
 
-- 
1.7.2.2
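
The self-deadlock described in the commit message can be illustrated with a small userspace model. This is only a sketch under stated assumptions: every model_* name is hypothetical and stands in for the kernel function named in its comment, and the model records the "wait for myself" condition in a flag instead of actually blocking.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model_dev {
	bool work_running;   /* true while the timeout work handler executes */
	bool would_deadlock; /* set when a handler tries to wait for itself */
	bool phy_connected;  /* stands in for ugeth->phydev != NULL */
	int  reinit_count;   /* number of in-place MAC/PHY reinits */
};

/* Models cancel_work_sync(): waiting for a handler that is the caller
 * itself can never complete, so record that instead of blocking. */
static void model_cancel_work_sync(struct model_dev *d)
{
	if (d->work_running)
		d->would_deadlock = true;
}

/* Models ucc_geth_close() after the patch: cancel the work, then tear down. */
static void model_close(struct model_dev *d)
{
	model_cancel_work_sync(d);
	d->phy_connected = false;
}

/* Old timeout work: recover by closing and reopening the device. */
static void model_timeout_work_old(struct model_dev *d)
{
	d->work_running = true;
	model_close(d);           /* waits for the work we are running in */
	d->phy_connected = true;  /* stands in for ucc_geth_open() */
	d->work_running = false;
}

/* New timeout work: reinit in place, never going through close. */
static void model_timeout_work_new(struct model_dev *d)
{
	d->work_running = true;
	d->reinit_count++;        /* stands in for stop + init_mac + phy_start */
	d->work_running = false;
}

/* Returns true if the old recovery path hits the self-wait condition. */
static bool old_path_deadlocks(void)
{
	struct model_dev d = { .phy_connected = true };

	model_timeout_work_old(&d);
	return d.would_deadlock;
}

/* Returns true if the new recovery path followed by a close hits it. */
static bool new_path_deadlocks(void)
{
	struct model_dev d = { .phy_connected = true };

	model_timeout_work_new(&d);
	model_close(&d);          /* e.g. ifconfig down, work no longer running */
	return d.would_deadlock;
}
```

In this model old_path_deadlocks() reports the problem and new_path_deadlocks() does not, which matches the reasoning in the commit message: once the timeout work no longer goes through close, close can safely wait for the work to finish.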


* [PATCH 2/2] ucc_geth: Fix deadlock
  2010-11-12 13:55 [PATCH 1/2] ucc_geth: Do not bring the whole IF down when TX failure Joakim Tjernlund
@ 2010-11-12 13:55 ` Joakim Tjernlund
  2010-11-12 14:09   ` Anton Vorontsov
  2010-11-12 14:05 ` [PATCH 1/2] ucc_geth: Do not bring the whole IF down when TX failure Anton Vorontsov
  1 sibling, 1 reply; 7+ messages in thread
From: Joakim Tjernlund @ 2010-11-12 13:55 UTC (permalink / raw)
  To: linuxppc-dev, netdev, Anton Vorontsov

This script:
 while [ 1==1 ] ; do ifconfig eth0 up; usleep 1950000 ;ifconfig eth0 down; dmesg -c ;done
causes in just a second or two:
INFO: task ifconfig:572 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
ifconfig      D 0ff65760     0   572    369 0x00000000
Call Trace:
[c6157be0] [c6008460] 0xc6008460 (unreliable)
[c6157ca0] [c0008608] __switch_to+0x4c/0x6c
[c6157cb0] [c028fecc] schedule+0x184/0x310
[c6157ce0] [c0290e54] __mutex_lock_slowpath+0xa4/0x150
[c6157d20] [c0290c48] mutex_lock+0x44/0x48
[c6157d30] [c01aba74] phy_stop+0x20/0x70
[c6157d40] [c01aef40] ucc_geth_stop+0x30/0x98
[c6157d60] [c01b18fc] ucc_geth_close+0x9c/0xdc
[c6157d80] [c01db0cc] __dev_close+0xa0/0xd0
[c6157d90] [c01deddc] __dev_change_flags+0x8c/0x148
[c6157db0] [c01def54] dev_change_flags+0x1c/0x64
[c6157dd0] [c0237ac8] devinet_ioctl+0x678/0x784
[c6157e50] [c0239a58] inet_ioctl+0xb0/0xbc
[c6157e60] [c01cafa8] sock_ioctl+0x174/0x2a0
[c6157e80] [c009a16c] vfs_ioctl+0xcc/0xe0
[c6157ea0] [c009a998] do_vfs_ioctl+0xc4/0x79c
[c6157f10] [c009b0b0] sys_ioctl+0x40/0x74
[c6157f40] [c00117c4] ret_from_syscall+0x0/0x38

The reason appears to be that ucc_geth_stop runs concurrently with
adjust_link as the PHY reports state changes. I believe adjust_link
hangs somewhere, holding the PHY lock, because ucc_geth_stop has
disabled the controller HW.
The fix is to stop the PHY before disabling the controller.

Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
---
 drivers/net/ucc_geth.c |   10 +++++++---
 1 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ucc_geth.c b/drivers/net/ucc_geth.c
index 6c254ed..06a5db3 100644
--- a/drivers/net/ucc_geth.c
+++ b/drivers/net/ucc_geth.c
@@ -2050,12 +2050,16 @@ static void ucc_geth_stop(struct ucc_geth_private *ugeth)
 
 	ugeth_vdbg("%s: IN", __func__);
 
+	/*
+	 * Tell the kernel the link is down.
+	 * Must be done before disabling the controller
+	 * or deadlock may happen.
+	 */
+	phy_stop(phydev);
+
 	/* Disable the controller */
 	ugeth_disable(ugeth, COMM_DIR_RX_AND_TX);
 
-	/* Tell the kernel the link is down */
-	phy_stop(phydev);
-
 	/* Mask all interrupts */
 	out_be32(ugeth->uccf->p_uccm, 0x00000000);
 
-- 
1.7.2.2
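
The ordering argument can be sketched with a small userspace model. It is only an illustration of the hypothesis stated in the commit message, not a confirmed root cause (the thread itself leaves open where the hang occurs). All model_* names are hypothetical, and the comments name the driver calls they stand in for.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model; the names mirror the driver but are not kernel code. */
struct model_port {
	bool controller_enabled;   /* cleared by the ugeth_disable() stand-in */
	bool phy_running;          /* cleared by the phy_stop() stand-in */
	int  callbacks_on_dead_hw; /* adjust_link-style calls on disabled HW */
};

/* A link-change event arrives; only a running PHY delivers it. */
static void model_link_event(struct model_port *p)
{
	if (!p->phy_running)
		return;
	if (!p->controller_enabled)
		p->callbacks_on_dead_hw++;
}

/* Tear down in the given order, with a link event between the two steps.
 * Returns how many callbacks ran against a disabled controller. */
static int model_stop(bool phy_first)
{
	struct model_port p = {
		.controller_enabled = true,
		.phy_running = true,
	};

	if (phy_first) {
		p.phy_running = false;        /* phy_stop() */
		model_link_event(&p);
		p.controller_enabled = false; /* ugeth_disable() */
	} else {
		p.controller_enabled = false; /* ugeth_disable() */
		model_link_event(&p);
		p.phy_running = false;        /* phy_stop() */
	}
	return p.callbacks_on_dead_hw;
}
```

With the old order, model_stop(false), one callback reaches the disabled controller. With the patched order, model_stop(true), none does, because the PHY no longer delivers events by the time the controller is turned off.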


* Re: [PATCH 1/2] ucc_geth: Do not bring the whole IF down when TX failure.
  2010-11-12 13:55 [PATCH 1/2] ucc_geth: Do not bring the whole IF down when TX failure Joakim Tjernlund
  2010-11-12 13:55 ` [PATCH 2/2] ucc_geth: Fix deadlock Joakim Tjernlund
@ 2010-11-12 14:05 ` Anton Vorontsov
  2010-11-12 20:24   ` David Miller
  1 sibling, 1 reply; 7+ messages in thread
From: Anton Vorontsov @ 2010-11-12 14:05 UTC (permalink / raw)
  To: Joakim Tjernlund; +Cc: netdev, linuxppc-dev

On Fri, Nov 12, 2010 at 02:55:08PM +0100, Joakim Tjernlund wrote:
> ucc_geth_close lacks a cancel_work_sync(&ugeth->timeout_work)
> to stop any outstanding processing of TX fail. However, one
> can not call cancel_work_sync without fixing the timeout function
> otherwise it will deadlock. This patch brings ucc_geth in line with
> gianfar:
> 
> Don't bring the interface down and up, just reinit controller HW
> and PHY.
> 
> Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>

Looks sane, thanks!

Reviewed-by: Anton Vorontsov <cbouatmailru@gmail.com>


* Re: [PATCH 2/2] ucc_geth: Fix deadlock
  2010-11-12 13:55 ` [PATCH 2/2] ucc_geth: Fix deadlock Joakim Tjernlund
@ 2010-11-12 14:09   ` Anton Vorontsov
  2010-11-12 20:25     ` David Miller
  2010-11-14 14:43     ` Joakim Tjernlund
  0 siblings, 2 replies; 7+ messages in thread
From: Anton Vorontsov @ 2010-11-12 14:09 UTC (permalink / raw)
  To: Joakim Tjernlund; +Cc: netdev, linuxppc-dev

On Fri, Nov 12, 2010 at 02:55:09PM +0100, Joakim Tjernlund wrote:
> This script:
>  while [ 1==1 ] ; do ifconfig eth0 up; usleep 1950000 ;ifconfig eth0 down; dmesg -c ;done
> causes in just a second or two:
> INFO: task ifconfig:572 blocked for more than 120 seconds.
[...]
> The reason appears to be ucc_geth_stop meets adjust_link as the
> PHY reports PHY changes. I believe adjust_link hangs somewhere,
> holding the PHY lock, because ucc_geth_stop disabled the
> controller HW.
> Fix is to stop the PHY before disabling the controller.
> 
> Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>

It's unclear where exactly adjust_link() hangs, but the patch
looks like the right thing overall.

Thanks!

Reviewed-by: Anton Vorontsov <cbouatmailru@gmail.com>

> ---
>  drivers/net/ucc_geth.c |   10 +++++++---
>  1 files changed, 7 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/net/ucc_geth.c b/drivers/net/ucc_geth.c
> index 6c254ed..06a5db3 100644
> --- a/drivers/net/ucc_geth.c
> +++ b/drivers/net/ucc_geth.c
> @@ -2050,12 +2050,16 @@ static void ucc_geth_stop(struct ucc_geth_private *ugeth)
>  
>  	ugeth_vdbg("%s: IN", __func__);
>  
> +	/*
> +	 * Tell the kernel the link is down.
> +	 * Must be done before disabling the controller
> +	 * or deadlock may happen.
> +	 */
> +	phy_stop(phydev);
> +
>  	/* Disable the controller */
>  	ugeth_disable(ugeth, COMM_DIR_RX_AND_TX);
>  
> -	/* Tell the kernel the link is down */
> -	phy_stop(phydev);
> -
>  	/* Mask all interrupts */
>  	out_be32(ugeth->uccf->p_uccm, 0x00000000);


* Re: [PATCH 1/2] ucc_geth: Do not bring the whole IF down when TX failure.
  2010-11-12 14:05 ` [PATCH 1/2] ucc_geth: Do not bring the whole IF down when TX failure Anton Vorontsov
@ 2010-11-12 20:24   ` David Miller
  0 siblings, 0 replies; 7+ messages in thread
From: David Miller @ 2010-11-12 20:24 UTC (permalink / raw)
  To: cbouatmailru; +Cc: netdev, linuxppc-dev

From: Anton Vorontsov <cbouatmailru@gmail.com>
Date: Fri, 12 Nov 2010 17:05:15 +0300

> On Fri, Nov 12, 2010 at 02:55:08PM +0100, Joakim Tjernlund wrote:
>> ucc_geth_close lacks a cancel_work_sync(&ugeth->timeout_work)
>> to stop any outstanding processing of TX fail. However, one
>> can not call cancel_work_sync without fixing the timeout function
>> otherwise it will deadlock. This patch brings ucc_geth in line with
>> gianfar:
>> 
>> Don't bring the interface down and up, just reinit controller HW
>> and PHY.
>> 
>> Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
> 
> Looks sane, thanks!
> 
> Reviewed-by: Anton Vorontsov <cbouatmailru@gmail.com>

Applied.


* Re: [PATCH 2/2] ucc_geth: Fix deadlock
  2010-11-12 14:09   ` Anton Vorontsov
@ 2010-11-12 20:25     ` David Miller
  2010-11-14 14:43     ` Joakim Tjernlund
  1 sibling, 0 replies; 7+ messages in thread
From: David Miller @ 2010-11-12 20:25 UTC (permalink / raw)
  To: cbouatmailru; +Cc: netdev, linuxppc-dev

From: Anton Vorontsov <cbouatmailru@gmail.com>
Date: Fri, 12 Nov 2010 17:09:47 +0300

> On Fri, Nov 12, 2010 at 02:55:09PM +0100, Joakim Tjernlund wrote:
>> This script:
>>  while [ 1==1 ] ; do ifconfig eth0 up; usleep 1950000 ;ifconfig eth0 down; dmesg -c ;done
>> causes in just a second or two:
>> INFO: task ifconfig:572 blocked for more than 120 seconds.
> [...]
>> The reason appears to be ucc_geth_stop meets adjust_link as the
>> PHY reports PHY changes. I believe adjust_link hangs somewhere,
>> holding the PHY lock, because ucc_geth_stop disabled the
>> controller HW.
>> Fix is to stop the PHY before disabling the controller.
>> 
>> Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
> 
> It's unclear where exactly adjust_link() hangs, but the patch
> looks as the right thing overall.
> 
> Thanks!
> 
> Reviewed-by: Anton Vorontsov <cbouatmailru@gmail.com>

Applied.


* Re: [PATCH 2/2] ucc_geth: Fix deadlock
  2010-11-12 14:09   ` Anton Vorontsov
  2010-11-12 20:25     ` David Miller
@ 2010-11-14 14:43     ` Joakim Tjernlund
  1 sibling, 0 replies; 7+ messages in thread
From: Joakim Tjernlund @ 2010-11-14 14:43 UTC (permalink / raw)
  To: Anton Vorontsov; +Cc: netdev, linuxppc-dev

Anton Vorontsov <cbouatmailru@gmail.com> wrote on 2010/11/12 15:09:47:
>
> On Fri, Nov 12, 2010 at 02:55:09PM +0100, Joakim Tjernlund wrote:
> > This script:
> >  while [ 1==1 ] ; do ifconfig eth0 up; usleep 1950000 ;ifconfig eth0 down; dmesg -c ;done
> > causes in just a second or two:
> > INFO: task ifconfig:572 blocked for more than 120 seconds.
> [...]
> > The reason appears to be ucc_geth_stop meets adjust_link as the
> > PHY reports PHY changes. I believe adjust_link hangs somewhere,
> > holding the PHY lock, because ucc_geth_stop disabled the
> > controller HW.
> > Fix is to stop the PHY before disabling the controller.
> >
> > Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
>
> It's unclear where exactly adjust_link() hangs, but the patch
> looks as the right thing overall.

Yes, I cannot find where it is hanging either, only that it is hanging
somewhere. I am starting to think it is hanging somewhere else. Anyhow,
the hang goes away 100% when this patch is applied.

 Jocke
