linux-wireless.vger.kernel.org archive mirror
* recursive locking on wireless-testing.
@ 2010-11-17 21:19 Ben Greear
  2010-11-18  0:37 ` Felix Fietkau
  0 siblings, 1 reply; 7+ messages in thread
From: Ben Greear @ 2010-11-17 21:19 UTC (permalink / raw)
  To: linux-wireless@vger.kernel.org

I found this while testing a wpa_supplicant build that shares scan results.
The kernel has no scan-sharing hacks in it, just a few patches
I've been using for a while (plus the deadlock-prevention patch
mentioned previously in other threads).


Nov 17 13:16:25 ath9k kernel: ieee80211 wiphy0: Removed STA 00:14:d1:c6:d2:54
Nov 17 13:16:25 ath9k kernel: ieee80211 wiphy0: Destroyed STA 00:14:d1:c6:d2:54

=============================================
[ INFO: possible recursive locking detected ]
2.6.37-rc1-wl+ #48
---------------------------------------------
wpa_supplicant/12334 is trying to acquire lock:
  (&(&txq->axq_lock)->rlock){+.-...}, at: [<f8fe90aa>] ath_tx_complete_buf+0x1d4/0x26c [ath9k]

but task is already holding lock:
  (&(&txq->axq_lock)->rlock){+.-...}, at: [<f8fe9ce6>] ath_tx_flush_tid+0x41/0xb6 [ath9k]

other info that might help us debug this:
6 locks held by wpa_supplicant/12334:
  #0:  (rtnl_mutex){+.+.+.}, at: [<786ffe97>] rtnl_lock+0xf/0x11
  #1:  (&wdev->mtx){+.+.+.}, at: [<f8bbc5a9>] cfg80211_wext_siwmlme+0x41/0x85 [cfg80211]
  #2:  (&ifmgd->mtx){+.+.+.}, at: [<f8f1b4ce>] ieee80211_mgd_deauth+0x28/0x1af [mac80211]
  #3:  (&local->sta_mtx){+.+.+.}, at: [<f8f1b150>] ieee80211_set_disassoc+0xab/0x1bc [mac80211]
  #4:  (&sta->ampdu_mlme.mtx){+.+...}, at: [<f8f19096>] __ieee80211_stop_tx_ba_session+0x25/0x4c [mac80211]
  #5:  (&(&txq->axq_lock)->rlock){+.-...}, at: [<f8fe9ce6>] ath_tx_flush_tid+0x41/0xb6 [ath9k]

stack backtrace:
Pid: 12334, comm: wpa_supplicant Not tainted 2.6.37-rc1-wl+ #48
Call Trace:
  [<7878bf56>] ? printk+0x18/0x1a
  [<7845bb58>] __lock_acquire+0xb14/0xb8b
  [<784593ff>] ? register_lock_class+0x17/0x297
  [<7845bc41>] lock_acquire+0x72/0x8d
  [<f8fe90aa>] ? ath_tx_complete_buf+0x1d4/0x26c [ath9k]
  [<7878de3a>] _raw_spin_lock_bh+0x38/0x45
  [<f8fe90aa>] ? ath_tx_complete_buf+0x1d4/0x26c [ath9k]
  [<f8fe90aa>] ath_tx_complete_buf+0x1d4/0x26c [ath9k]
  [<f8fe9d31>] ath_tx_flush_tid+0x8c/0xb6 [ath9k]
  [<f8fea716>] ath_tx_aggr_stop+0x7e/0x86 [ath9k]
  [<f8fe56bd>] ath9k_ampdu_action+0x93/0xf4 [ath9k]
  [<f8fe562a>] ? ath9k_ampdu_action+0x0/0xf4 [ath9k]
  [<f8f18708>] drv_ampdu_action+0x60/0x68 [mac80211]
  [<f8f18faf>] ___ieee80211_stop_tx_ba_session+0xde/0xfd [mac80211]
  [<f8f190aa>] __ieee80211_stop_tx_ba_session+0x39/0x4c [mac80211]
  [<f8f18683>] ieee80211_sta_tear_down_BA_sessions+0x31/0x56 [mac80211]
  [<f8f1b0a0>] ? set_sta_flags+0x23/0x28 [mac80211]
  [<f8f1b175>] ieee80211_set_disassoc+0xd0/0x1bc [mac80211]
  [<f8f1b4f5>] ieee80211_mgd_deauth+0x4f/0x1af [mac80211]
  [<f8f22cf1>] ieee80211_deauth+0x14/0x16 [mac80211]
  [<f8bb71e9>] __cfg80211_mlme_deauth+0x105/0x10d [cfg80211]
  [<f8bb994e>] __cfg80211_disconnect+0x112/0x199 [cfg80211]
  [<f8bbc5cc>] cfg80211_wext_siwmlme+0x64/0x85 [cfg80211]
  [<7876e089>] ioctl_standard_call+0x1f0/0x28e
  [<786f2b2b>] ? dev_name_hash+0x16/0x48
  [<786f653c>] ? __dev_get_by_name+0x32/0x3d
  [<7876e1b4>] wext_handle_ioctl+0x8d/0x18d
  [<f8bbc568>] ? cfg80211_wext_siwmlme+0x0/0x85 [cfg80211]
  [<786f7669>] dev_ioctl+0x520/0x53f
  [<785977bb>] ? copy_to_user+0x2f/0x108
  [<786e69dc>] ? sys_recvfrom+0xb8/0xc6
  [<786e5d1f>] ? sock_ioctl+0x0/0x202
  [<786e5f15>] sock_ioctl+0x1f6/0x202
  [<786e5d1f>] ? sock_ioctl+0x0/0x202
  [<784cc071>] do_vfs_ioctl+0x56d/0x5c3
  [<784c130d>] ? fcheck_files+0x9b/0xca
  [<784c1369>] ? fget_light+0x2d/0xb0
  [<784cc10a>] sys_ioctl+0x43/0x62
  [<784030dc>] sysenter_do_call+0x12/0x38

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com



* Re: recursive locking on wireless-testing.
  2010-11-17 21:19 recursive locking on wireless-testing Ben Greear
@ 2010-11-18  0:37 ` Felix Fietkau
  2010-11-18  0:42   ` Ben Greear
  2010-11-18  0:55   ` Ben Greear
  0 siblings, 2 replies; 7+ messages in thread
From: Felix Fietkau @ 2010-11-18  0:37 UTC (permalink / raw)
  To: Ben Greear; +Cc: linux-wireless@vger.kernel.org

On 2010-11-17 10:19 PM, Ben Greear wrote:
> I found this while testing wpa_supplicant that shares scan results.
> The kernel has no scan-sharing hacks in it..just a few patches
> I've been using for a while (and the deadlock prevention patch
> previously mentioned in other threads).
> 
> 
> Nov 17 13:16:25 ath9k kernel: ieee80211 wiphy0: Removed STA 00:14:d1:c6:d2:54
> Nov 17 13:16:25 ath9k kernel: ieee80211 wiphy0: Destroyed STA 00:14:d1:c6:d2:54
> 
> =============================================
> [ INFO: possible recursive locking detected ]
> 2.6.37-rc1-wl+ #48
> ---------------------------------------------
This should fix it. ath_tx_complete is already called with the txq locked.

--- a/drivers/net/wireless/ath/ath9k/xmit.c
+++ b/drivers/net/wireless/ath/ath9k/xmit.c
@@ -1830,10 +1830,8 @@ static void ath_tx_complete(struct ath_s
 	else {
 		q = skb_get_queue_mapping(skb);
 		if (txq == sc->tx.txq_map[q]) {
-			spin_lock_bh(&txq->axq_lock);
 			if (WARN_ON(--txq->pending_frames < 0))
 				txq->pending_frames = 0;
-			spin_unlock_bh(&txq->axq_lock);
 		}

 		ieee80211_tx_status(hw, skb);
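
For readers following the trace, the pattern lockdep is flagging can be modelled in user space. This is only a minimal sketch under stated assumptions: toy_lock, toy_txq, flush() and the two complete_*() helpers are made-up names for illustration, not ath9k code, and the toy lock reports a nested acquire where a real non-recursive spinlock would just spin.

```c
#include <stdio.h>

/* Toy stand-in for a non-recursive lock; NOT the kernel's spinlock_t. */
struct toy_lock {
	int held;
};

/* Returns 0 on success, -1 if the lock is already held (nested acquire). */
static int toy_lock_acquire(struct toy_lock *l)
{
	if (l->held) {
		fprintf(stderr, "possible recursive locking detected\n");
		return -1;
	}
	l->held = 1;
	return 0;
}

static void toy_lock_release(struct toy_lock *l)
{
	l->held = 0;
}

/* Hypothetical queue with a lock and a frame counter. */
struct toy_txq {
	struct toy_lock lock;
	int pending_frames;
};

/* Completion path that takes the lock itself. */
static int complete_locking(struct toy_txq *q)
{
	if (toy_lock_acquire(&q->lock) < 0)
		return -1;	/* a real spinlock would spin here instead */
	q->pending_frames--;
	toy_lock_release(&q->lock);
	return 0;
}

/* Completion path that relies on the caller already holding the lock. */
static int complete_caller_locked(struct toy_txq *q)
{
	q->pending_frames--;
	return 0;
}

/* Flush path: holds the lock across the completion callback. */
static int flush(struct toy_txq *q, int (*complete)(struct toy_txq *))
{
	int ret;

	if (toy_lock_acquire(&q->lock) < 0)
		return -1;
	ret = complete(q);
	toy_lock_release(&q->lock);
	return ret;
}
```

Calling flush(&q, complete_locking) fails on the nested acquire, while flush(&q, complete_caller_locked) succeeds. The second variant is only safe if every caller of the completion path really does hold the lock, which the sketch does not check.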


* Re: recursive locking on wireless-testing.
  2010-11-18  0:37 ` Felix Fietkau
@ 2010-11-18  0:42   ` Ben Greear
  2010-11-18  0:55   ` Ben Greear
  1 sibling, 0 replies; 7+ messages in thread
From: Ben Greear @ 2010-11-18  0:42 UTC (permalink / raw)
  To: Felix Fietkau; +Cc: linux-wireless@vger.kernel.org

On 11/17/2010 04:37 PM, Felix Fietkau wrote:
> On 2010-11-17 10:19 PM, Ben Greear wrote:
>> I found this while testing wpa_supplicant that shares scan results.
>> The kernel has no scan-sharing hacks in it..just a few patches
>> I've been using for a while (and the deadlock prevention patch
>> previously mentioned in other threads).
>>
>>
>> Nov 17 13:16:25 ath9k kernel: ieee80211 wiphy0: Removed STA 00:14:d1:c6:d2:54
>> Nov 17 13:16:25 ath9k kernel: ieee80211 wiphy0: Destroyed STA 00:14:d1:c6:d2:54
>>
>> =============================================
>> [ INFO: possible recursive locking detected ]
>> 2.6.37-rc1-wl+ #48
>> ---------------------------------------------
> This should fix it. ath_tx_complete is already called with the txq locked.

Thanks, I'll give it a try now.

Ben

>
> --- a/drivers/net/wireless/ath/ath9k/xmit.c
> +++ b/drivers/net/wireless/ath/ath9k/xmit.c
> @@ -1830,10 +1830,8 @@ static void ath_tx_complete(struct ath_s
>   	else {
>   		q = skb_get_queue_mapping(skb);
>   		if (txq == sc->tx.txq_map[q]) {
> -			spin_lock_bh(&txq->axq_lock);
>   			if (WARN_ON(--txq->pending_frames < 0))
>   				txq->pending_frames = 0;
> -			spin_unlock_bh(&txq->axq_lock);
>   		}
>
>   		ieee80211_tx_status(hw, skb);


-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com



* Re: recursive locking on wireless-testing.
  2010-11-18  0:37 ` Felix Fietkau
  2010-11-18  0:42   ` Ben Greear
@ 2010-11-18  0:55   ` Ben Greear
  2010-11-18  9:53     ` Felix Fietkau
  1 sibling, 1 reply; 7+ messages in thread
From: Ben Greear @ 2010-11-18  0:55 UTC (permalink / raw)
  To: Felix Fietkau; +Cc: linux-wireless@vger.kernel.org

On 11/17/2010 04:37 PM, Felix Fietkau wrote:
> On 2010-11-17 10:19 PM, Ben Greear wrote:
>> I found this while testing wpa_supplicant that shares scan results.
>> The kernel has no scan-sharing hacks in it..just a few patches
>> I've been using for a while (and the deadlock prevention patch
>> previously mentioned in other threads).
>>
>>
>> Nov 17 13:16:25 ath9k kernel: ieee80211 wiphy0: Removed STA 00:14:d1:c6:d2:54
>> Nov 17 13:16:25 ath9k kernel: ieee80211 wiphy0: Destroyed STA 00:14:d1:c6:d2:54
>>
>> =============================================
>> [ INFO: possible recursive locking detected ]
>> 2.6.37-rc1-wl+ #48
>> ---------------------------------------------
> This should fix it. ath_tx_complete is already called with the txq locked.
>
> --- a/drivers/net/wireless/ath/ath9k/xmit.c
> +++ b/drivers/net/wireless/ath/ath9k/xmit.c
> @@ -1830,10 +1830,8 @@ static void ath_tx_complete(struct ath_s
>   	else {
>   		q = skb_get_queue_mapping(skb);
>   		if (txq == sc->tx.txq_map[q]) {
> -			spin_lock_bh(&txq->axq_lock);
>   			if (WARN_ON(--txq->pending_frames < 0))
>   				txq->pending_frames = 0;
> -			spin_unlock_bh(&txq->axq_lock);
>   		}
>
>   		ieee80211_tx_status(hw, skb);


I restarted a few times and haven't seen any lockdep errors.  I did, however,
hit the WARN_ON that used to be inside that lock:

------------[ cut here ]------------
WARNING: at /home/greearb/git/linux.wireless-testing/drivers/net/wireless/ath/ath9k/xmit.c:1833 ath_tx_complete_buf+0x1d5/0x240 [ath9k]()
Hardware name: PDSBM
Modules linked in: aes_i586 aes_generic 8021q garp stp llc michael_mic macvlan pktgen nfs lockd fscache nfs_acl auth_rpcgss sunrpc p4_clockmod ipv6 uinput arc4 
ecb e1000e ath9k mac80211 ath9k_common ath9k_hw ath i2c_i801 cfg80211 iTCO_wdt iTCO_vendor_support pcspkr microcode i915 drm_kms_helper drm i2c_algo_bit 
i2c_core video output [last unloaded: ipt_addrtype]
Pid: 0, comm: swapper Tainted: P            2.6.37-rc2-wl+ #50
Call Trace:
  [<78436f25>] warn_slowpath_common+0x77/0x8c
  [<f86e910b>] ? ath_tx_complete_buf+0x1d5/0x240 [ath9k]
  [<f86e910b>] ? ath_tx_complete_buf+0x1d5/0x240 [ath9k]
  [<78436f57>] warn_slowpath_null+0x1d/0x1f
  [<f86e910b>] ath_tx_complete_buf+0x1d5/0x240 [ath9k]
  [<7843c24b>] ? _local_bh_enable_ip+0x9d/0xa6
  [<f86eb1bf>] ath_tx_tasklet+0x242/0x2b6 [ath9k]
  [<f86e68bc>] ath9k_tasklet+0xb9/0x127 [ath9k]
  [<7843bb0d>] tasklet_action+0x88/0xe3
  [<7843c089>] __do_softirq+0x85/0x142
  [<7843c004>] ? __do_softirq+0x0/0x142
  <IRQ>  [<7843beab>] ? irq_exit+0x35/0x69
  [<78404245>] ? do_IRQ+0x8e/0xa2
  [<7844e97c>] ? hrtimer_start+0x22/0x28
  [<784036ae>] ? common_interrupt+0x2e/0x40
  [<78408a12>] ? mwait_idle+0x59/0x69
  [<78402417>] ? cpu_idle+0x4e/0x6b
  [<78779419>] ? rest_init+0xa1/0xa7
  [<78779378>] ? rest_init+0x0/0xa7
  [<78992949>] ? start_kernel+0x334/0x33a
  [<7899244f>] ? unknown_bootoption+0x0/0x190
  [<789920e2>] ? i386_start_kernel+0xe2/0xea
---[ end trace a659d7b152ca5d4f ]---
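
As a reminder of what that check does, here is a minimal user-space sketch of the decrement-and-clamp pattern from the quoted patch context. The names toy_warn_on() and frame_done() are hypothetical, and toy_warn_on() only imitates the way the kernel's WARN_ON() reports a condition and evaluates to it; it does not dump a stack trace.

```c
#include <stdio.h>

/* User-space stand-in for WARN_ON(): report the condition and return it. */
static int toy_warn_on(int cond, const char *what)
{
	if (cond)
		fprintf(stderr, "WARNING: %s\n", what);
	return cond;
}

/*
 * Decrement the counter. If it goes negative, warn and reset it to zero.
 * Returns 1 if the warning fired, 0 otherwise.
 */
static int frame_done(int *pending_frames)
{
	if (toy_warn_on(--(*pending_frames) < 0, "pending_frames underflow")) {
		*pending_frames = 0;
		return 1;
	}
	return 0;
}
```

The warning firing means more completions were counted than frames were recorded as pending; the sketch says nothing about why that happened here.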


Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com



* Re: recursive locking on wireless-testing.
  2010-11-18  0:55   ` Ben Greear
@ 2010-11-18  9:53     ` Felix Fietkau
  2010-11-18 18:15       ` Ben Greear
  2010-11-19 23:12       ` Ben Greear
  0 siblings, 2 replies; 7+ messages in thread
From: Felix Fietkau @ 2010-11-18  9:53 UTC (permalink / raw)
  To: Ben Greear; +Cc: linux-wireless@vger.kernel.org

On 2010-11-18 1:55 AM, Ben Greear wrote:
> On 11/17/2010 04:37 PM, Felix Fietkau wrote:
>> On 2010-11-17 10:19 PM, Ben Greear wrote:
>>> I found this while testing wpa_supplicant that shares scan results.
>>> The kernel has no scan-sharing hacks in it..just a few patches
>>> I've been using for a while (and the deadlock prevention patch
>>> previously mentioned in other threads).
>>>
>>>
>>> Nov 17 13:16:25 ath9k kernel: ieee80211 wiphy0: Removed STA 00:14:d1:c6:d2:54
>>> Nov 17 13:16:25 ath9k kernel: ieee80211 wiphy0: Destroyed STA 00:14:d1:c6:d2:54
>>>
>>> =============================================
>>> [ INFO: possible recursive locking detected ]
>>> 2.6.37-rc1-wl+ #48
>>> ---------------------------------------------
>> This should fix it. ath_tx_complete is already called with the txq locked.
>>
>> --- a/drivers/net/wireless/ath/ath9k/xmit.c
>> +++ b/drivers/net/wireless/ath/ath9k/xmit.c
>> @@ -1830,10 +1830,8 @@ static void ath_tx_complete(struct ath_s
>>   	else {
>>   		q = skb_get_queue_mapping(skb);
>>   		if (txq == sc->tx.txq_map[q]) {
>> -			spin_lock_bh(&txq->axq_lock);
>>   			if (WARN_ON(--txq->pending_frames < 0))
>>   				txq->pending_frames = 0;
>> -			spin_unlock_bh(&txq->axq_lock);
>>   		}
>>
>>   		ieee80211_tx_status(hw, skb);
> 
> 
How about this instead of the other patch?

--- a/drivers/net/wireless/ath/ath9k/xmit.c
+++ b/drivers/net/wireless/ath/ath9k/xmit.c
@@ -163,6 +163,7 @@ static void ath_tx_flush_tid(struct ath_
 		bf = list_first_entry(&tid->buf_q, struct ath_buf, list);
 		list_move_tail(&bf->list, &bf_head);
 
+		spin_unlock_bh(&txq->axq_lock);
 		fi = get_frame_info(bf->bf_mpdu);
 		if (fi->retries) {
 			ath_tx_update_baw(sc, tid, fi->seqno);
@@ -170,6 +171,7 @@ static void ath_tx_flush_tid(struct ath_
 		} else {
 			ath_tx_send_normal(sc, txq, tid, &bf_head);
 		}
+		spin_lock_bh(&txq->axq_lock);
 	}
 
 	spin_unlock_bh(&txq->axq_lock);
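
The idea in this second patch, dropping the lock around a callee that takes it on its own and re-taking it before the next loop iteration, can be sketched in user space as follows. This is an illustration under stated assumptions: toy_lock, toy_tid, complete_frame() and flush_tid() are made-up names, the array stands in for the driver's buffer list, and the loop condition is my own choice rather than something shown in the diff.

```c
#include <stdio.h>

/* Toy stand-in for a non-recursive lock; NOT the kernel's spinlock_t. */
struct toy_lock {
	int held;
};

/* Returns 0 on success, -1 if the lock is already held (nested acquire). */
static int toy_lock_acquire(struct toy_lock *l)
{
	if (l->held) {
		fprintf(stderr, "possible recursive locking detected\n");
		return -1;
	}
	l->held = 1;
	return 0;
}

static void toy_lock_release(struct toy_lock *l)
{
	l->held = 0;
}

#define QLEN 3

/* Hypothetical per-TID state: a small queue protected by one lock. */
struct toy_tid {
	struct toy_lock lock;
	int buf_q[QLEN];	/* frames waiting to be flushed */
	int head;		/* index of the next frame to dequeue */
	int count;		/* frames still queued */
	int completed;		/* frames handed to the completion path */
	int last_completed;	/* most recent frame completed */
};

/* Completion path that acquires the lock itself. */
static int complete_frame(struct toy_tid *t, int frame)
{
	if (toy_lock_acquire(&t->lock) < 0)
		return -1;
	t->completed++;
	t->last_completed = frame;
	toy_lock_release(&t->lock);
	return 0;
}

/* Returns the number of failed completions, or -1 on a locking error. */
static int flush_tid(struct toy_tid *t)
{
	int errors = 0;

	if (toy_lock_acquire(&t->lock) < 0)
		return -1;
	while (t->count > 0) {
		/* Dequeue under the lock so the queue stays consistent. */
		int frame = t->buf_q[t->head++];
		t->count--;

		/* Drop the lock: the callee acquires it on its own. */
		toy_lock_release(&t->lock);
		if (complete_frame(t, frame) < 0)
			errors++;
		/* Re-take it before touching the queue again. */
		if (toy_lock_acquire(&t->lock) < 0)
			return -1;
	}
	toy_lock_release(&t->lock);
	return errors;
}
```

One consequence of this pattern is that the queue can change while the lock is dropped, so the loop has to re-read the queue state after re-acquiring the lock rather than caching it across iterations.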


* Re: recursive locking on wireless-testing.
  2010-11-18  9:53     ` Felix Fietkau
@ 2010-11-18 18:15       ` Ben Greear
  2010-11-19 23:12       ` Ben Greear
  1 sibling, 0 replies; 7+ messages in thread
From: Ben Greear @ 2010-11-18 18:15 UTC (permalink / raw)
  To: Felix Fietkau; +Cc: linux-wireless@vger.kernel.org

On 11/18/2010 01:53 AM, Felix Fietkau wrote:
> On 2010-11-18 1:55 AM, Ben Greear wrote:
>> On 11/17/2010 04:37 PM, Felix Fietkau wrote:
>>> On 2010-11-17 10:19 PM, Ben Greear wrote:
>>>> I found this while testing wpa_supplicant that shares scan results.
>>>> The kernel has no scan-sharing hacks in it..just a few patches
>>>> I've been using for a while (and the deadlock prevention patch
>>>> previously mentioned in other threads).
>>>>
>>>>
>>>> Nov 17 13:16:25 ath9k kernel: ieee80211 wiphy0: Removed STA 00:14:d1:c6:d2:54
>>>> Nov 17 13:16:25 ath9k kernel: ieee80211 wiphy0: Destroyed STA 00:14:d1:c6:d2:54
>>>>
>>>> =============================================
>>>> [ INFO: possible recursive locking detected ]
>>>> 2.6.37-rc1-wl+ #48
>>>> ---------------------------------------------
>>> This should fix it. ath_tx_complete is already called with the txq locked.
>>>
>>> --- a/drivers/net/wireless/ath/ath9k/xmit.c
>>> +++ b/drivers/net/wireless/ath/ath9k/xmit.c
>>> @@ -1830,10 +1830,8 @@ static void ath_tx_complete(struct ath_s
>>>    	else {
>>>    		q = skb_get_queue_mapping(skb);
>>>    		if (txq == sc->tx.txq_map[q]) {
>>> -			spin_lock_bh(&txq->axq_lock);
>>>    			if (WARN_ON(--txq->pending_frames < 0))
>>>    				txq->pending_frames = 0;
>>> -			spin_unlock_bh(&txq->axq_lock);
>>>    		}
>>>
>>>    		ieee80211_tx_status(hw, skb);
>>
>>
> How about this instead of the other patch?
>
> --- a/drivers/net/wireless/ath/ath9k/xmit.c
> +++ b/drivers/net/wireless/ath/ath9k/xmit.c
> @@ -163,6 +163,7 @@ static void ath_tx_flush_tid(struct ath_
>   		bf = list_first_entry(&tid->buf_q, struct ath_buf, list);
>   		list_move_tail(&bf->list, &bf_head);
>
> +		spin_unlock_bh(&txq->axq_lock);
>   		fi = get_frame_info(bf->bf_mpdu);
>   		if (fi->retries) {
>   			ath_tx_update_baw(sc, tid, fi->seqno);
> @@ -170,6 +171,7 @@ static void ath_tx_flush_tid(struct ath_
>   		} else {
>   			ath_tx_send_normal(sc, txq, tid, &bf_head);
>   		}
> +		spin_lock_bh(&txq->axq_lock);
>   	}
>
>   	spin_unlock_bh(&txq->axq_lock);

I'll give this a try later.  Overnight my ath9k box started spitting out endless
ath9k TX DMA errors, and it seems to have corrupted the root (/) file system or
the disk again:

[drm] Initialized i915 1.6.0 20080730 for 0000:00:02.0 on minor 0
dracut: Starting plymouth daemon
dracut: rd_NO_DM: removing DM RAID activation
dracut: rd_NO_MD: removing MD RAID activation
input: ImExPS/2 Generic Explorer Mouse as /devices/platform/i8042/serio1/input/input2

No root device found

No root device found

Boot has failed, sleeping forever.



Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com



* Re: recursive locking on wireless-testing.
  2010-11-18  9:53     ` Felix Fietkau
  2010-11-18 18:15       ` Ben Greear
@ 2010-11-19 23:12       ` Ben Greear
  1 sibling, 0 replies; 7+ messages in thread
From: Ben Greear @ 2010-11-19 23:12 UTC (permalink / raw)
  To: Felix Fietkau; +Cc: linux-wireless@vger.kernel.org

On 11/18/2010 01:53 AM, Felix Fietkau wrote:
> On 2010-11-18 1:55 AM, Ben Greear wrote:
>> On 11/17/2010 04:37 PM, Felix Fietkau wrote:
>>> On 2010-11-17 10:19 PM, Ben Greear wrote:
>>>> I found this while testing wpa_supplicant that shares scan results.
>>>> The kernel has no scan-sharing hacks in it..just a few patches
>>>> I've been using for a while (and the deadlock prevention patch
>>>> previously mentioned in other threads).
>>>>
>>>>
>>>> Nov 17 13:16:25 ath9k kernel: ieee80211 wiphy0: Removed STA 00:14:d1:c6:d2:54
>>>> Nov 17 13:16:25 ath9k kernel: ieee80211 wiphy0: Destroyed STA 00:14:d1:c6:d2:54
>>>>
>>>> =============================================
>>>> [ INFO: possible recursive locking detected ]
>>>> 2.6.37-rc1-wl+ #48
>>>> ---------------------------------------------
>>> This should fix it. ath_tx_complete is already called with the txq locked.
>>>
>>> --- a/drivers/net/wireless/ath/ath9k/xmit.c
>>> +++ b/drivers/net/wireless/ath/ath9k/xmit.c
>>> @@ -1830,10 +1830,8 @@ static void ath_tx_complete(struct ath_s
>>>    	else {
>>>    		q = skb_get_queue_mapping(skb);
>>>    		if (txq == sc->tx.txq_map[q]) {
>>> -			spin_lock_bh(&txq->axq_lock);
>>>    			if (WARN_ON(--txq->pending_frames < 0))
>>>    				txq->pending_frames = 0;
>>> -			spin_unlock_bh(&txq->axq_lock);
>>>    		}
>>>
>>>    		ieee80211_tx_status(hw, skb);
>>
>>
> How about this instead of the other patch?
>
> --- a/drivers/net/wireless/ath/ath9k/xmit.c
> +++ b/drivers/net/wireless/ath/ath9k/xmit.c
> @@ -163,6 +163,7 @@ static void ath_tx_flush_tid(struct ath_
>   		bf = list_first_entry(&tid->buf_q, struct ath_buf, list);
>   		list_move_tail(&bf->list, &bf_head);
>
> +		spin_unlock_bh(&txq->axq_lock);
>   		fi = get_frame_info(bf->bf_mpdu);
>   		if (fi->retries) {
>   			ath_tx_update_baw(sc, tid, fi->seqno);
> @@ -170,6 +171,7 @@ static void ath_tx_flush_tid(struct ath_
>   		} else {
>   			ath_tx_send_normal(sc, txq, tid, &bf_head);
>   		}
> +		spin_lock_bh(&txq->axq_lock);
>   	}
>
>   	spin_unlock_bh(&txq->axq_lock);

I don't see any lockdep errors with this, but I did see this warning:


WARNING: at /home/greearb/git/linux.wireless-testing/drivers/net/wireless/ath/ath9k/recv.c:532 ath_stoprecv+0x90/0x9a [ath9k]()
Hardware name: PDSBM
Could not stop RX, we could be confusing the DMA engine when we start RX up
Modules linked in: bluetooth aes_i586 aes_generic 8021q garp stp llc michael_mic macvlan pktgen fuse nfs lockd fscache nfs_acl auth_rpcgss sunrpc ipv6]
Pid: 16092, comm: kworker/u:0 Tainted: P        W   2.6.37-rc2-wl+ #50
Call Trace:
  [<78436f25>] warn_slowpath_common+0x77/0x8c
  [<f91f513e>] ? ath_stoprecv+0x90/0x9a [ath9k]
  [<f91f513e>] ? ath_stoprecv+0x90/0x9a [ath9k]
  [<78436fb6>] warn_slowpath_fmt+0x2e/0x30
  [<f91f513e>] ath_stoprecv+0x90/0x9a [ath9k]
  [<f91f40dc>] ath_set_channel+0x94/0x1e8 [ath9k]
  [<f8c73767>] ? ath_hw_cycle_counters_update+0xc4/0x114 [ath]
  [<f91f4574>] ath9k_config+0x344/0x423 [ath9k]
  [<f9111aaa>] ieee80211_hw_config+0x11b/0x125 [mac80211]
  [<f9115dea>] ieee80211_scan_work+0x29e/0x3f8 [mac80211]
  [<7845a5e5>] ? trace_hardirqs_on+0xb/0xd
  [<7878ea66>] ? _raw_spin_unlock_irq+0x22/0x2b
  [<78446ecb>] ? process_one_work+0x13e/0x2bf
  [<78446f3c>] process_one_work+0x1af/0x2bf
  [<78446ecb>] ? process_one_work+0x13e/0x2bf
  [<f9115b4c>] ? ieee80211_scan_work+0x0/0x3f8 [mac80211]
  [<7844868a>] worker_thread+0xf9/0x1bf
  [<78448591>] ? worker_thread+0x0/0x1bf
  [<7844b1ba>] kthread+0x62/0x67
  [<7844b158>] ? kthread+0x0/0x67
  [<784036c6>] kernel_thread_helper+0x6/0x1a


That warning, or something similar, was already happening before, so your patch may still be fine.

Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

