* [PATCH]atl1c:Do not call cancel_work_sync from the work itself
@ 2009-08-04 2:19 jie.yang
2009-08-04 2:32 ` David Miller
2009-08-04 7:52 ` Andrew Morton
0 siblings, 2 replies; 3+ messages in thread
From: jie.yang @ 2009-08-04 2:19 UTC (permalink / raw)
To: davem; +Cc: netdev, linux-kernel, jie yang
Do not call cancel_work_sync from the work itself, for it my cause
recursive locking.
detail info:
events/1/10 is trying to acquire lock:
(&adapter->reset_task){+.+...}, at: [<c043e384>] __cancel_work_timer+0x80/0x187
but task is already holding lock:
(&adapter->reset_task){+.+...}, at: [<c043ed6f>] worker_thread+0x127/0x234
other info that might help us debug this:
2 locks held by events/1/10:
#0: (events){+.+.+.}, at: [<c043ed6f>] worker_thread+0x127/0x234
#1: (&adapter->reset_task){+.+...}, at: [<c043ed6f>] worker_thread+0x127/0x234
stack backtrace:
Pid: 10, comm: events/1 Not tainted 2.6.31-rc2 #12
Call Trace:
[<c04519c9>] validate_chain+0x4ae/0xb26
[<c0451e8d>] ? validate_chain+0x972/0xb26
[<c0437f4b>] ? lock_timer_base+0x1f/0x3e
[<c04526f8>] __lock_acquire+0x6b7/0x745
[<c0452816>] lock_acquire+0x90/0xad
[<c043e384>] ? __cancel_work_timer+0x80/0x187
[<c043e3b1>] __cancel_work_timer+0xad/0x187
[<c043e384>] ? __cancel_work_timer+0x80/0x187
[<c044f6b3>] ? mark_held_locks+0x3d/0x58
[<c0690855>] ? _spin_unlock_irqrestore+0x36/0x3c
[<c044f7d5>] ? trace_hardirqs_on_caller+0x107/0x12f
[<c044f808>] ? trace_hardirqs_on+0xb/0xd
[<c0437fb2>] ? try_to_del_timer_sync+0x48/0x4f
[<c043e4a2>] cancel_work_sync+0xa/0xc
[<f8955f95>] atl1c_down+0x1f/0xde [atl1c]
[<f8956955>] atl1c_reset_task+0x1f/0x31 [atl1c]
[<c043edae>] worker_thread+0x166/0x234
[<c043ed6f>] ? worker_thread+0x127/0x234
[<f8956936>] ? atl1c_reset_task+0x0/0x31 [atl1c]
[<c0441ac5>] ? autoremove_wake_function+0x0/0x33
[<c043ec48>] ? worker_thread+0x0/0x234
[<c0441a25>] kthread+0x69/0x70
[<c04419bc>] ? kthread+0x0/0x70
[<c04034b7>] kernel_thread_helper+0x7/0x10
So when atl1c_reset_task be scheduled just set a flag in ctrl_flag,
to demonstrate it is in reset_task, when atl1c_down is call it will not call
cancel_work_sync(&adapter->reset_task) if it sees the flag.
Signed-off-by: jie yang <jie.yang@atheros.com>
---
diff --git a/drivers/net/atl1c/atl1c.h b/drivers/net/atl1c/atl1c.h
index 2a1120a..53242dc 100644
--- a/drivers/net/atl1c/atl1c.h
+++ b/drivers/net/atl1c/atl1c.h
@@ -427,6 +427,7 @@ struct atl1c_hw {
#define ATL1C_ASPM_CTRL_MON 0x0200
#define ATL1C_HIB_DISABLE 0x0400
#define ATL1C_LINK_CAP_1000M 0x0800
+#define ATL1C_RESET_IN_WORK 0x1000
#define ATL1C_FPGA_VERSION 0x8000
u16 cmb_tpd;
u16 cmb_rrd;
diff --git a/drivers/net/atl1c/atl1c_main.c b/drivers/net/atl1c/atl1c_main.c
index 1d601ce..dec88fa 100644
--- a/drivers/net/atl1c/atl1c_main.c
+++ b/drivers/net/atl1c/atl1c_main.c
@@ -321,7 +321,10 @@ static void atl1c_del_timer(struct atl1c_adapter *adapter)
static void atl1c_cancel_work(struct atl1c_adapter *adapter)
{
- cancel_work_sync(&adapter->reset_task);
+ if (adapter->hw.ctrl_flags & ATL1C_RESET_IN_WORK)
+ adapter->hw.ctrl_flags &= ~ATL1C_RESET_IN_WORK;/* clear the flag */
+ else
+ cancel_work_sync(&adapter->reset_task);
cancel_work_sync(&adapter->link_chg_task);
}
@@ -1544,6 +1547,7 @@ static irqreturn_t atl1c_intr(int irq, void *data)
/* reset MAC */
hw->intr_mask &= ~ISR_ERROR;
AT_WRITE_REG(hw, REG_IMR, hw->intr_mask);
+ adapter->hw.ctrl_flags |= ATL1C_RESET_IN_WORK;
schedule_work(&adapter->reset_task);
break;
}
^ permalink raw reply related [flat|nested] 3+ messages in thread
* Re: [PATCH]atl1c:Do not call cancel_work_sync from the work itself
2009-08-04 2:19 [PATCH]atl1c:Do not call cancel_work_sync from the work itself jie.yang
@ 2009-08-04 2:32 ` David Miller
2009-08-04 7:52 ` Andrew Morton
1 sibling, 0 replies; 3+ messages in thread
From: David Miller @ 2009-08-04 2:32 UTC (permalink / raw)
To: jie.yang; +Cc: netdev, linux-kernel
From: <jie.yang@atheros.com>
Date: Tue, 4 Aug 2009 10:19:06 +0800
> Do not call cancel_work_sync from the work itself, for it my cause
> recursive locking.
Sorry, this won't work.
Your patches, all of them, are corrupted by your email client. It
turns tab characters into spaces as well as making other textual
modifications to the patch, such that it will not apply.
Please fix this up and resubmit all of your patches when the problem
is corrected.
If you have to, email the patches to yourself and try to apply them
just to make sure.
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [PATCH]atl1c:Do not call cancel_work_sync from the work itself
2009-08-04 2:19 [PATCH]atl1c:Do not call cancel_work_sync from the work itself jie.yang
2009-08-04 2:32 ` David Miller
@ 2009-08-04 7:52 ` Andrew Morton
1 sibling, 0 replies; 3+ messages in thread
From: Andrew Morton @ 2009-08-04 7:52 UTC (permalink / raw)
To: jie.yang; +Cc: davem, netdev, linux-kernel
On Tue, 4 Aug 2009 10:19:06 +0800 <jie.yang@atheros.com> wrote:
> Do not call cancel_work_sync from the work itself, for it my cause
> recursive locking.
> detail info:
> events/1/10 is trying to acquire lock:
> (&adapter->reset_task){+.+...}, at: [<c043e384>] __cancel_work_timer+0x80/0x187
>
> but task is already holding lock:
> (&adapter->reset_task){+.+...}, at: [<c043ed6f>] worker_thread+0x127/0x234
> other info that might help us debug this:
>
> 2 locks held by events/1/10:
> #0: (events){+.+.+.}, at: [<c043ed6f>] worker_thread+0x127/0x234
> #1: (&adapter->reset_task){+.+...}, at: [<c043ed6f>] worker_thread+0x127/0x234
>
> stack backtrace:
> Pid: 10, comm: events/1 Not tainted 2.6.31-rc2 #12
> Call Trace:
> [<c04519c9>] validate_chain+0x4ae/0xb26
> [<c0451e8d>] ? validate_chain+0x972/0xb26
> [<c0437f4b>] ? lock_timer_base+0x1f/0x3e
> [<c04526f8>] __lock_acquire+0x6b7/0x745
> [<c0452816>] lock_acquire+0x90/0xad
> [<c043e384>] ? __cancel_work_timer+0x80/0x187
> [<c043e3b1>] __cancel_work_timer+0xad/0x187
> [<c043e384>] ? __cancel_work_timer+0x80/0x187
> [<c044f6b3>] ? mark_held_locks+0x3d/0x58
> [<c0690855>] ? _spin_unlock_irqrestore+0x36/0x3c
> [<c044f7d5>] ? trace_hardirqs_on_caller+0x107/0x12f
> [<c044f808>] ? trace_hardirqs_on+0xb/0xd
> [<c0437fb2>] ? try_to_del_timer_sync+0x48/0x4f
> [<c043e4a2>] cancel_work_sync+0xa/0xc
> [<f8955f95>] atl1c_down+0x1f/0xde [atl1c]
> [<f8956955>] atl1c_reset_task+0x1f/0x31 [atl1c]
> [<c043edae>] worker_thread+0x166/0x234
> [<c043ed6f>] ? worker_thread+0x127/0x234
> [<f8956936>] ? atl1c_reset_task+0x0/0x31 [atl1c]
> [<c0441ac5>] ? autoremove_wake_function+0x0/0x33
> [<c043ec48>] ? worker_thread+0x0/0x234
> [<c0441a25>] kthread+0x69/0x70
> [<c04419bc>] ? kthread+0x0/0x70
> [<c04034b7>] kernel_thread_helper+0x7/0x10
>
> So when atl1c_reset_task be scheduled just set a flag in ctrl_flag,
> to demonstrate it is in reset_task, when atl1c_down is call it will not call
> cancel_work_sync(&adapter->reset_task) if it sees the flag.
>
> Signed-off-by: jie yang <jie.yang@atheros.com>
> ---
> diff --git a/drivers/net/atl1c/atl1c.h b/drivers/net/atl1c/atl1c.h
> index 2a1120a..53242dc 100644
> --- a/drivers/net/atl1c/atl1c.h
> +++ b/drivers/net/atl1c/atl1c.h
> @@ -427,6 +427,7 @@ struct atl1c_hw {
> #define ATL1C_ASPM_CTRL_MON 0x0200
> #define ATL1C_HIB_DISABLE 0x0400
> #define ATL1C_LINK_CAP_1000M 0x0800
> +#define ATL1C_RESET_IN_WORK 0x1000
> #define ATL1C_FPGA_VERSION 0x8000
> u16 cmb_tpd;
> u16 cmb_rrd;
> diff --git a/drivers/net/atl1c/atl1c_main.c b/drivers/net/atl1c/atl1c_main.c
> index 1d601ce..dec88fa 100644
> --- a/drivers/net/atl1c/atl1c_main.c
> +++ b/drivers/net/atl1c/atl1c_main.c
> @@ -321,7 +321,10 @@ static void atl1c_del_timer(struct atl1c_adapter *adapter)
>
> static void atl1c_cancel_work(struct atl1c_adapter *adapter)
> {
> - cancel_work_sync(&adapter->reset_task);
> + if (adapter->hw.ctrl_flags & ATL1C_RESET_IN_WORK)
> + adapter->hw.ctrl_flags &= ~ATL1C_RESET_IN_WORK;/* clear the flag */
> + else
> + cancel_work_sync(&adapter->reset_task);
> cancel_work_sync(&adapter->link_chg_task);
> }
>
> @@ -1544,6 +1547,7 @@ static irqreturn_t atl1c_intr(int irq, void *data)
> /* reset MAC */
> hw->intr_mask &= ~ISR_ERROR;
> AT_WRITE_REG(hw, REG_IMR, hw->intr_mask);
> + adapter->hw.ctrl_flags |= ATL1C_RESET_IN_WORK;
> schedule_work(&adapter->reset_task);
> break;
> }
The use of non-atomic bit operations upon ctrl_flags looks unsafe. If
a CPU is running atl1c_cancel_work() and then an interrupt happens (on
this CPU or a different one), then the interrupt will set
ATL1C_RESET_IN_WORK. But the non-atomic operation in
atl1c_cancel_work() can just overwrite that change.
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2009-08-04 7:53 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-08-04 2:19 [PATCH]atl1c:Do not call cancel_work_sync from the work itself jie.yang
2009-08-04 2:32 ` David Miller
2009-08-04 7:52 ` Andrew Morton
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).