* [PATCH 1/2] ucc_geth: Do not bring the whole IF down when TX failure.
@ 2010-11-12 13:55 Joakim Tjernlund
2010-11-12 13:55 ` [PATCH 2/2] ucc_geth: Fix deadlock Joakim Tjernlund
2010-11-12 14:05 ` [PATCH 1/2] ucc_geth: Do not bring the whole IF down when TX failure Anton Vorontsov
0 siblings, 2 replies; 7+ messages in thread
From: Joakim Tjernlund @ 2010-11-12 13:55 UTC (permalink / raw)
To: linuxppc-dev, netdev, Anton Vorontsov
ucc_geth_close lacks a cancel_work_sync(&ugeth->timeout_work)
to stop any outstanding processing of a TX failure. However, one
cannot call cancel_work_sync there without first fixing the timeout
function: the timeout work itself calls ucc_geth_close, so it would
wait on itself and deadlock. This patch brings ucc_geth in line with
gianfar:
Don't bring the interface down and up, just reinit controller HW
and PHY.
Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
---
drivers/net/ucc_geth.c | 15 +++++++++------
1 files changed, 9 insertions(+), 6 deletions(-)
diff --git a/drivers/net/ucc_geth.c b/drivers/net/ucc_geth.c
index 97f9f7d..6c254ed 100644
--- a/drivers/net/ucc_geth.c
+++ b/drivers/net/ucc_geth.c
@@ -2065,9 +2065,6 @@ static void ucc_geth_stop(struct ucc_geth_private *ugeth)
/* Disable Rx and Tx */
clrbits32(&ug_regs->maccfg1, MACCFG1_ENABLE_RX | MACCFG1_ENABLE_TX);
- phy_disconnect(ugeth->phydev);
- ugeth->phydev = NULL;
-
ucc_geth_memclean(ugeth);
}
@@ -3556,7 +3553,10 @@ static int ucc_geth_close(struct net_device *dev)
napi_disable(&ugeth->napi);
+ cancel_work_sync(&ugeth->timeout_work);
ucc_geth_stop(ugeth);
+ phy_disconnect(ugeth->phydev);
+ ugeth->phydev = NULL;
free_irq(ugeth->ug_info->uf_info.irq, ugeth->ndev);
@@ -3585,8 +3585,12 @@ static void ucc_geth_timeout_work(struct work_struct *work)
* Must reset MAC *and* PHY. This is done by reopening
* the device.
*/
- ucc_geth_close(dev);
- ucc_geth_open(dev);
+ netif_tx_stop_all_queues(dev);
+ ucc_geth_stop(ugeth);
+ ucc_geth_init_mac(ugeth);
+ /* Must start PHY here */
+ phy_start(ugeth->phydev);
+ netif_tx_start_all_queues(dev);
}
netif_tx_schedule_all(dev);
@@ -3600,7 +3604,6 @@ static void ucc_geth_timeout(struct net_device *dev)
{
struct ucc_geth_private *ugeth = netdev_priv(dev);
- netif_carrier_off(dev);
schedule_work(&ugeth->timeout_work);
}
--
1.7.2.2
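Note on the patch above: the self-wait it avoids can be modelled in
plain user-space C. This is only a sketch and not driver code. Every
toy_* name, old_timeout_fn and new_timeout_fn are hypothetical stand-ins
invented for illustration, and toy_cancel_sync merely counts the case
where a work item waits for itself instead of blocking as a real
synchronous cancel would.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a work item; NOT the kernel's work_struct. */
struct toy_work {
	void (*fn)(void);
	bool running;		/* true while fn is executing */
};

static struct toy_work timeout_work;
static int self_waits;		/* times a work item waited on itself */

/*
 * Models "wait until the work item has finished". Called from inside
 * the item itself, a real wait could never return; the toy only counts it.
 */
static void toy_cancel_sync(struct toy_work *w)
{
	if (w->running)
		self_waits++;
}

/* Models the close path after the patch: it cancels the timeout work. */
static void toy_close(void)
{
	toy_cancel_sync(&timeout_work);
}

/* Old handler shape: recover by going through the close path. */
static void old_timeout_fn(void)
{
	toy_close();
}

/* New handler shape: reinit in place, never entering the close path. */
static void new_timeout_fn(void)
{
	/* stop queues, stop MAC, reinit MAC, start PHY, start queues */
}

/* Run a handler as the timeout work; report how often it self-waited. */
static int toy_run_timeout(void (*fn)(void))
{
	assert(fn != NULL);
	self_waits = 0;
	timeout_work.fn = fn;
	timeout_work.running = true;
	timeout_work.fn();
	timeout_work.running = false;
	return self_waits;
}
```

Calling toy_run_timeout(old_timeout_fn) reports one self-wait, while
toy_run_timeout(new_timeout_fn) reports none, which is the property the
patch relies on when it adds the cancel to the close path.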
* [PATCH 2/2] ucc_geth: Fix deadlock
2010-11-12 13:55 [PATCH 1/2] ucc_geth: Do not bring the whole IF down when TX failure Joakim Tjernlund
@ 2010-11-12 13:55 ` Joakim Tjernlund
2010-11-12 14:09 ` Anton Vorontsov
2010-11-12 14:05 ` [PATCH 1/2] ucc_geth: Do not bring the whole IF down when TX failure Anton Vorontsov
1 sibling, 1 reply; 7+ messages in thread
From: Joakim Tjernlund @ 2010-11-12 13:55 UTC (permalink / raw)
To: linuxppc-dev, netdev, Anton Vorontsov
This script:
while [ 1==1 ] ; do ifconfig eth0 up; usleep 1950000 ;ifconfig eth0 down; dmesg -c ;done
causes in just a second or two:
INFO: task ifconfig:572 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
ifconfig D 0ff65760 0 572 369 0x00000000
Call Trace:
[c6157be0] [c6008460] 0xc6008460 (unreliable)
[c6157ca0] [c0008608] __switch_to+0x4c/0x6c
[c6157cb0] [c028fecc] schedule+0x184/0x310
[c6157ce0] [c0290e54] __mutex_lock_slowpath+0xa4/0x150
[c6157d20] [c0290c48] mutex_lock+0x44/0x48
[c6157d30] [c01aba74] phy_stop+0x20/0x70
[c6157d40] [c01aef40] ucc_geth_stop+0x30/0x98
[c6157d60] [c01b18fc] ucc_geth_close+0x9c/0xdc
[c6157d80] [c01db0cc] __dev_close+0xa0/0xd0
[c6157d90] [c01deddc] __dev_change_flags+0x8c/0x148
[c6157db0] [c01def54] dev_change_flags+0x1c/0x64
[c6157dd0] [c0237ac8] devinet_ioctl+0x678/0x784
[c6157e50] [c0239a58] inet_ioctl+0xb0/0xbc
[c6157e60] [c01cafa8] sock_ioctl+0x174/0x2a0
[c6157e80] [c009a16c] vfs_ioctl+0xcc/0xe0
[c6157ea0] [c009a998] do_vfs_ioctl+0xc4/0x79c
[c6157f10] [c009b0b0] sys_ioctl+0x40/0x74
[c6157f40] [c00117c4] ret_from_syscall+0x0/0x38
The cause appears to be ucc_geth_stop racing with adjust_link when
the PHY reports a link change. I believe adjust_link hangs somewhere,
holding the PHY lock, because ucc_geth_stop has already disabled the
controller HW.
The fix is to stop the PHY before disabling the controller.
Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
---
drivers/net/ucc_geth.c | 10 +++++++---
1 files changed, 7 insertions(+), 3 deletions(-)
diff --git a/drivers/net/ucc_geth.c b/drivers/net/ucc_geth.c
index 6c254ed..06a5db3 100644
--- a/drivers/net/ucc_geth.c
+++ b/drivers/net/ucc_geth.c
@@ -2050,12 +2050,16 @@ static void ucc_geth_stop(struct ucc_geth_private *ugeth)
ugeth_vdbg("%s: IN", __func__);
+ /*
+ * Tell the kernel the link is down.
+ * Must be done before disabling the controller
+ * or deadlock may happen.
+ */
+ phy_stop(phydev);
+
/* Disable the controller */
ugeth_disable(ugeth, COMM_DIR_RX_AND_TX);
- /* Tell the kernel the link is down */
- phy_stop(phydev);
-
/* Mask all interrupts */
out_be32(ugeth->uccf->p_uccm, 0x00000000);
--
1.7.2.2
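Note on the patch above: the suspected ordering problem can be sketched
as a single-threaded toy model in user-space C. It encodes only the
hypothesis from the commit message (the link callback holds the PHY
lock and cannot finish once the controller is disabled). The thread
does not establish where the hang really occurs, so treat this as an
illustration of the reordering rather than a diagnosis. All toy_* names
are hypothetical and do not correspond to driver or phylib functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device state; not taken from the driver. */
struct toy_dev {
	bool ctrl_enabled;	/* controller HW running */
	bool phy_running;	/* PHY still delivers link callbacks */
	bool lock_held;		/* PHY lock currently taken */
};

/*
 * Assumed behaviour: the link callback takes the PHY lock and can only
 * complete (and release the lock) while the controller is enabled.
 */
static void toy_adjust_link(struct toy_dev *d)
{
	assert(d != NULL);
	if (!d->phy_running)
		return;			/* stopped PHY: no callback */
	d->lock_held = true;
	if (d->ctrl_enabled)
		d->lock_held = false;	/* finished normally */
	/* otherwise the callback is stuck and keeps the lock */
}

/* Needs the PHY lock; returns false where a real call would block. */
static bool toy_phy_stop(struct toy_dev *d)
{
	if (d->lock_held)
		return false;
	d->phy_running = false;
	return true;
}

static void toy_disable_ctrl(struct toy_dev *d)
{
	d->ctrl_enabled = false;
}

/* Old order: disable controller, link change arrives, then stop PHY. */
static bool toy_stop_old_order(void)
{
	struct toy_dev d = { true, true, false };

	toy_disable_ctrl(&d);
	toy_adjust_link(&d);
	return toy_phy_stop(&d);
}

/* New order: stop PHY first, so a later link change is never delivered. */
static bool toy_stop_new_order(void)
{
	struct toy_dev d = { true, true, false };
	bool stopped = toy_phy_stop(&d);

	toy_adjust_link(&d);
	toy_disable_ctrl(&d);
	return stopped;
}
```

Under these assumptions toy_stop_old_order() returns false (the stop
would block on the held lock) and toy_stop_new_order() returns true.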
* Re: [PATCH 1/2] ucc_geth: Do not bring the whole IF down when TX failure.
2010-11-12 13:55 [PATCH 1/2] ucc_geth: Do not bring the whole IF down when TX failure Joakim Tjernlund
2010-11-12 13:55 ` [PATCH 2/2] ucc_geth: Fix deadlock Joakim Tjernlund
@ 2010-11-12 14:05 ` Anton Vorontsov
2010-11-12 20:24 ` David Miller
1 sibling, 1 reply; 7+ messages in thread
From: Anton Vorontsov @ 2010-11-12 14:05 UTC (permalink / raw)
To: Joakim Tjernlund; +Cc: netdev, linuxppc-dev
On Fri, Nov 12, 2010 at 02:55:08PM +0100, Joakim Tjernlund wrote:
> ucc_geth_close lacks a cancel_work_sync(&ugeth->timeout_work)
> to stop any outstanding processing of TX fail. However, one
> cannot call cancel_work_sync without fixing the timeout function
> otherwise it will deadlock. This patch brings ucc_geth in line with
> gianfar:
>
> Don't bring the interface down and up, just reinit controller HW
> and PHY.
>
> Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
Looks sane, thanks!
Reviewed-by: Anton Vorontsov <cbouatmailru@gmail.com>
* Re: [PATCH 2/2] ucc_geth: Fix deadlock
2010-11-12 13:55 ` [PATCH 2/2] ucc_geth: Fix deadlock Joakim Tjernlund
@ 2010-11-12 14:09 ` Anton Vorontsov
2010-11-12 20:25 ` David Miller
2010-11-14 14:43 ` Joakim Tjernlund
0 siblings, 2 replies; 7+ messages in thread
From: Anton Vorontsov @ 2010-11-12 14:09 UTC (permalink / raw)
To: Joakim Tjernlund; +Cc: netdev, linuxppc-dev
On Fri, Nov 12, 2010 at 02:55:09PM +0100, Joakim Tjernlund wrote:
> This script:
> while [ 1==1 ] ; do ifconfig eth0 up; usleep 1950000 ;ifconfig eth0 down; dmesg -c ;done
> causes in just a second or two:
> INFO: task ifconfig:572 blocked for more than 120 seconds.
[...]
> The reason appears to be ucc_geth_stop meets adjust_link as the
> PHY reports PHY changes. I believe adjust_link hangs somewhere,
> holding the PHY lock, because ucc_geth_stop disabled the
> controller HW.
> Fix is to stop the PHY before disabling the controller.
>
> Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
It's unclear where exactly adjust_link() hangs, but the patch
looks like the right thing overall.
Thanks!
Reviewed-by: Anton Vorontsov <cbouatmailru@gmail.com>
> ---
> drivers/net/ucc_geth.c | 10 +++++++---
> 1 files changed, 7 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/net/ucc_geth.c b/drivers/net/ucc_geth.c
> index 6c254ed..06a5db3 100644
> --- a/drivers/net/ucc_geth.c
> +++ b/drivers/net/ucc_geth.c
> @@ -2050,12 +2050,16 @@ static void ucc_geth_stop(struct ucc_geth_private *ugeth)
>
> ugeth_vdbg("%s: IN", __func__);
>
> + /*
> + * Tell the kernel the link is down.
> + * Must be done before disabling the controller
> + * or deadlock may happen.
> + */
> + phy_stop(phydev);
> +
> /* Disable the controller */
> ugeth_disable(ugeth, COMM_DIR_RX_AND_TX);
>
> - /* Tell the kernel the link is down */
> - phy_stop(phydev);
> -
> /* Mask all interrupts */
> out_be32(ugeth->uccf->p_uccm, 0x00000000);
* Re: [PATCH 1/2] ucc_geth: Do not bring the whole IF down when TX failure.
2010-11-12 14:05 ` [PATCH 1/2] ucc_geth: Do not bring the whole IF down when TX failure Anton Vorontsov
@ 2010-11-12 20:24 ` David Miller
0 siblings, 0 replies; 7+ messages in thread
From: David Miller @ 2010-11-12 20:24 UTC (permalink / raw)
To: cbouatmailru; +Cc: netdev, linuxppc-dev
From: Anton Vorontsov <cbouatmailru@gmail.com>
Date: Fri, 12 Nov 2010 17:05:15 +0300
> On Fri, Nov 12, 2010 at 02:55:08PM +0100, Joakim Tjernlund wrote:
>> ucc_geth_close lacks a cancel_work_sync(&ugeth->timeout_work)
>> to stop any outstanding processing of TX fail. However, one
>> cannot call cancel_work_sync without fixing the timeout function
>> otherwise it will deadlock. This patch brings ucc_geth in line with
>> gianfar:
>>
>> Don't bring the interface down and up, just reinit controller HW
>> and PHY.
>>
>> Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
>
> Looks sane, thanks!
>
> Reviewed-by: Anton Vorontsov <cbouatmailru@gmail.com>
Applied.
* Re: [PATCH 2/2] ucc_geth: Fix deadlock
2010-11-12 14:09 ` Anton Vorontsov
@ 2010-11-12 20:25 ` David Miller
2010-11-14 14:43 ` Joakim Tjernlund
1 sibling, 0 replies; 7+ messages in thread
From: David Miller @ 2010-11-12 20:25 UTC (permalink / raw)
To: cbouatmailru; +Cc: netdev, linuxppc-dev
From: Anton Vorontsov <cbouatmailru@gmail.com>
Date: Fri, 12 Nov 2010 17:09:47 +0300
> On Fri, Nov 12, 2010 at 02:55:09PM +0100, Joakim Tjernlund wrote:
>> This script:
>> while [ 1==1 ] ; do ifconfig eth0 up; usleep 1950000 ;ifconfig eth0 down; dmesg -c ;done
>> causes in just a second or two:
>> INFO: task ifconfig:572 blocked for more than 120 seconds.
> [...]
>> The reason appears to be ucc_geth_stop meets adjust_link as the
>> PHY reports PHY changes. I believe adjust_link hangs somewhere,
>> holding the PHY lock, because ucc_geth_stop disabled the
>> controller HW.
>> Fix is to stop the PHY before disabling the controller.
>>
>> Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
>
> It's unclear where exactly adjust_link() hangs, but the patch
> looks like the right thing overall.
>
> Thanks!
>
> Reviewed-by: Anton Vorontsov <cbouatmailru@gmail.com>
Applied.
* Re: [PATCH 2/2] ucc_geth: Fix deadlock
2010-11-12 14:09 ` Anton Vorontsov
2010-11-12 20:25 ` David Miller
@ 2010-11-14 14:43 ` Joakim Tjernlund
1 sibling, 0 replies; 7+ messages in thread
From: Joakim Tjernlund @ 2010-11-14 14:43 UTC (permalink / raw)
To: Anton Vorontsov; +Cc: netdev, linuxppc-dev
Anton Vorontsov <cbouatmailru@gmail.com> wrote on 2010/11/12 15:09:47:
>
> On Fri, Nov 12, 2010 at 02:55:09PM +0100, Joakim Tjernlund wrote:
> > This script:
> > while [ 1==1 ] ; do ifconfig eth0 up; usleep 1950000 ;ifconfig eth0 down; dmesg -c ;done
> > causes in just a second or two:
> > INFO: task ifconfig:572 blocked for more than 120 seconds.
> [...]
> > The reason appears to be ucc_geth_stop meets adjust_link as the
> > PHY reports PHY changes. I believe adjust_link hangs somewhere,
> > holding the PHY lock, because ucc_geth_stop disabled the
> > controller HW.
> > Fix is to stop the PHY before disabling the controller.
> >
> > Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
>
> It's unclear where exactly adjust_link() hangs, but the patch
> looks like the right thing overall.
Yes, I too cannot find where it is hanging, just that it is hanging somewhere.
I am starting to think it is hanging somewhere else. Anyhow, the hang
goes away 100% when this patch is applied.
Jocke