* Deadlock in mac80211 running 3.18.11 + compat-wireless 2015-03-09
@ 2015-10-06 13:36 Michael Braun
2015-10-06 13:42 ` Johannes Berg
0 siblings, 1 reply; 2+ messages in thread
From: Michael Braun @ 2015-10-06 13:36 UTC (permalink / raw)
To: johannes; +Cc: projekt-wlan, linux-wireless
[-- Attachment #1: Type: text/plain, Size: 1620 bytes --]
Dear Maintainer,
I'm running a custom version of OpenWRT based on Linux 3.18.11 with
compat-wireless 2015-03-09 and am sometimes experiencing deadlock
warnings on P1020WLAN (PPC).
Though I believe my modifications do not affect the deadlock, I
currently do not have the opportunity to test an unmodified kernel.
Please find a backtrace attached. Some symbols in it appear to be
resolved incorrectly; it makes more sense when reading
minstrel_remove_sta_debugfs as minstrel_ht_get_rate ->
minstrel_aggr_check, and ieee80211_proberesp_get,
ieee80211_get_buffered_bc as ieee80211_xmit and
ieee80211_tx_h_rate_ctrl.
This occurs while running infrastructure (AP) mode and IBSS
simultaneously.
The important part (stack):
CPU 0
1. ieee80211_subif_start_xmit
2. ieee80211_get_buffered_bc
3. ieee80211_proberesp_get
4. rate_control_get_rate -> acquires sta->rate_ctrl_lock
5. minstrel_ht_get_rate
6. minstrel_aggr_check
7. ieee80211_start_tx_ba_session -> wait for sta->lock
CPU 1
1. ieee80211_ibss_leave
2. ieee80211_stop_tx_ba_cb -> acquires sta->lock
3. ieee80211_send_delba
4. ieee80211_tx_skb
5. ieee80211_tx_skb_tid
6. __ieee80211_tx_skb_tid_band
7. ieee80211_xmit
8. ieee80211_tx
9. invoke_tx_handlers
10. ieee80211_tx_h_rate_ctrl
11. rate_control_get_rate -> wait for sta->rate_ctrl_lock
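The two stacks above take the same pair of locks in opposite order. As a minimal sketch of why that is reported as a possible deadlock, the C fragment below records "lock B was taken while holding lock A" dependencies and refuses an acquisition that would close a cycle. It only loosely resembles what lockdep does; the enum values, `dep` table and function names are hypothetical and are not part of mac80211 or the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical lock ids standing in for the two spinlocks in the report. */
enum { STA_LOCK, RATE_CTRL_LOCK, NUM_LOCKS };

/* dep[a][b] is true once some path has taken b while holding a. */
static bool dep[NUM_LOCKS][NUM_LOCKS];

static void reset_deps(void)
{
    memset(dep, 0, sizeof(dep));
}

/* Is there a chain of recorded dependencies leading from 'from' to 'to'? */
static bool reachable(int from, int to, bool *seen)
{
    if (from == to)
        return true;
    seen[from] = true;
    for (int next = 0; next < NUM_LOCKS; next++)
        if (dep[from][next] && !seen[next] && reachable(next, to, seen))
            return true;
    return false;
}

/*
 * Record "acquired was taken while held was held".
 * Returns false, without recording, if that would close a cycle.
 */
static bool acquire_while_holding(int held, int acquired)
{
    bool seen[NUM_LOCKS] = { false };

    assert(held >= 0 && held < NUM_LOCKS);
    assert(acquired >= 0 && acquired < NUM_LOCKS);

    if (reachable(acquired, held, seen))
        return false; /* acquired -> ... -> held already exists */
    dep[held][acquired] = true;
    return true;
}
```

In this model, `acquire_while_holding(RATE_CTRL_LOCK, STA_LOCK)` corresponds to the CPU 0 path and succeeds, while a later `acquire_while_holding(STA_LOCK, RATE_CTRL_LOCK)` corresponds to the CPU 1 path and is rejected as an inversion.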
I'm unsure how to address this. Replacing sta->rate_ctrl_lock with
sta->lock breaks due to spinlock nesting and might be overkill.
If there are patches I could test them.
If you believe that this is not a valid upstream issue, please let me
know.
Thanks,
M. Braun
[-- Attachment #2: deadlock-trace.txt --]
[-- Type: text/plain, Size: 6231 bytes --]
[ 9750.975999] ======================================================
[ 9750.982171] [ INFO: possible circular locking dependency detected ]
[ 9750.988434] 3.18.11 #2 Not tainted
[ 9750.991828] -------------------------------------------------------
[ 9750.998088] kworker/u4:0/4018 is trying to acquire lock:
[ 9751.003392] (&(&sta->rate_ctrl_lock)->rlock){+.-...}, at: [<f39709d4>] rate_control_get_rate+0xb0/0x148 [mac80211]
[ 9751.013905]
but task is already holding lock:
[ 9751.019731] (&(&sta->lock)->rlock){+.-...}, at: [<f3967370>] ieee80211_stop_tx_ba_cb+0x70/0x190 [mac80211]
[ 9751.029496]
which lock already depends on the new lock.
[ 9751.037669]
the existing dependency chain (in reverse order) is:
[ 9751.045144]
-> #1 (&(&sta->lock)->rlock){+.-...}:
[ 9751.050030] [<c035a558>] _raw_spin_lock_bh+0x44/0x5c
[ 9751.055615] [<f3966558>] ieee80211_start_tx_ba_session+0xd0/0x268 [mac80211]
[ 9751.063281] [<f39a9620>] minstrel_remove_sta_debugfs+0x784/0x1c68 [mac80211]
[ 9751.070948] [<f39709f4>] rate_control_get_rate+0xd0/0x148 [mac80211]
[ 9751.077921] [<f3980338>] ieee80211_proberesp_get+0x1154/0x2c48 [mac80211]
[ 9751.085328] [<f3982a64>] ieee80211_get_buffered_bc+0x234/0x268 [mac80211]
[ 9751.092735] [<f3983640>] __ieee80211_subif_start_xmit+0x354/0x3f4 [mac80211]
[ 9751.100402] [<f39836f4>] ieee80211_subif_start_xmit+0x14/0x28 [mac80211]
[ 9751.107722] [<c02b7650>] dev_hard_start_xmit+0x2d4/0x38c
[ 9751.113646] [<c02d241c>] sch_direct_xmit+0x98/0x200
[ 9751.119135] [<c02b7a64>] __dev_queue_xmit+0x35c/0x62c
[ 9751.124793] [<f33b3320>] br_dev_queue_push_xmit+0x1e8/0x230 [bridge]
[ 9751.131797] [<f33b3508>] br_deliver+0x90/0x328 [bridge]
[ 9751.137638] [<f33b4788>] br_handle_frame_finish+0x150/0x1d0 [bridge]
[ 9751.144606] [<f33b4ac4>] br_handle_frame+0x2bc/0xc24 [bridge]
[ 9751.150965] [<c02b22fc>] __netif_receive_skb_core+0x548/0x7c0
[ 9751.157319] [<c02b5924>] process_backlog+0xa4/0x178
[ 9751.162805] [<c02b5114>] net_rx_action+0x94/0x1a4
[ 9751.168116] [<c002c6f4>] __do_softirq+0x100/0x244
[ 9751.173433] [<c000cc5c>] call_do_softirq+0x24/0x3c
[ 9751.178832] [<c00048a4>] do_softirq_own_stack+0x44/0x7c
[ 9751.184667] [<c002c91c>] do_softirq+0x58/0x94
[ 9751.189630] [<c02b1c44>] netif_rx_ni+0x48/0x60
[ 9751.194680] [<f317f210>] tun_get_socket+0x10a4/0x37ac [tun]
[ 9751.200885] [<f317f5ac>] tun_get_socket+0x1440/0x37ac [tun]
[ 9751.207071] [<c00dab90>] do_sync_write+0x70/0xa4
[ 9751.212303] [<c00db7c8>] vfs_write+0xb0/0x1bc
[ 9751.217267] [<c00dbd1c>] SyS_write+0x4c/0xa4
[ 9751.222144] [<c000e4e8>] ret_from_syscall+0x0/0x3c
[ 9751.227544]
-> #0 (&(&sta->rate_ctrl_lock)->rlock){+.-...}:
[ 9751.233300] [<c0060fb0>] lock_acquire+0x50/0x6c
[ 9751.238439] [<c035a558>] _raw_spin_lock_bh+0x44/0x5c
[ 9751.244015] [<f39709d4>] rate_control_get_rate+0xb0/0x148 [mac80211]
[ 9751.251004] [<f3980338>] ieee80211_proberesp_get+0x1154/0x2c48 [mac80211]
[ 9751.258411] [<f3982a64>] ieee80211_get_buffered_bc+0x234/0x268 [mac80211]
[ 9751.265818] [<f3983ae8>] __ieee80211_tx_skb_tid_band+0x7c/0x728 [mac80211]
[ 9751.273311] [<f39659bc>] ieee80211_send_delba+0x2bc/0x2e4 [mac80211]
[ 9751.280283] [<f39673b8>] ieee80211_stop_tx_ba_cb+0xb8/0x190 [mac80211]
[ 9751.287427] [<f396c9d4>] ieee80211_ibss_leave+0xb08/0x17d0 [mac80211]
[ 9751.294484] [<c003fa98>] process_one_work+0x210/0x3a0
[ 9751.300144] [<c0040154>] worker_thread+0x270/0x428
[ 9751.305541] [<c0044194>] kthread+0xd0/0xd4
[ 9751.310248] [<c000e634>] ret_from_kernel_thread+0x5c/0x64
[ 9751.316255]
other info that might help us debug this:
[ 9751.324255]  Possible unsafe locking scenario:
[ 9751.330169]        CPU0                    CPU1
[ 9751.334691]        ----                    ----
[ 9751.339212]   lock(&(&sta->lock)->rlock);
[ 9751.343221]                                lock(&(&sta->rate_ctrl_lock)->rlock);
[ 9751.350615]                                lock(&(&sta->lock)->rlock);
[ 9751.357142]   lock(&(&sta->rate_ctrl_lock)->rlock);
[ 9751.362019]
*** DEADLOCK ***
[ 9751.367938] 6 locks held by kworker/u4:0/4018:
[ 9751.372374] #0: ("%s"wiphy_name(local->hw.wiphy)){++++.+}, at: [<c003fa28>] process_one_work+0x1a0/0x3a0
[ 9751.382045] #1: ((&sdata->work)){+.+.+.}, at: [<c003fa28>] process_one_work+0x1a0/0x3a0
[ 9751.390239] #2: (&local->sta_mtx){+.+.+.}, at: [<f3967338>] ieee80211_stop_tx_ba_cb+0x38/0x190 [mac80211]
[ 9751.400009] #3: (&sta->ampdu_mlme.mtx){+.+.+.}, at: [<f3967364>] ieee80211_stop_tx_ba_cb+0x64/0x190 [mac80211]
[ 9751.410209] #4: (&(&sta->lock)->rlock){+.-...}, at: [<f3967370>] ieee80211_stop_tx_ba_cb+0x70/0x190 [mac80211]
[ 9751.420413] #5: (rcu_read_lock){......}, at: [<f39658f4>] ieee80211_send_delba+0x1f4/0x2e4 [mac80211]
[ 9751.429833]
stack backtrace:
[ 9751.434190] CPU: 0 PID: 4018 Comm: kworker/u4:0 Not tainted 3.18.11 #2
[ 9751.440726] Workqueue: phy0 ieee80211_ibss_leave [mac80211]
[ 9751.446292] [stack] Call Trace:
[ 9751.449434] [stack] [ecebdb30] [c035e054] dump_stack+0x78/0xa0 (unreliable)
[ 9751.456398] [stack] [ecebdb40] [c035c430] print_circular_bug+0x320/0x338
[ 9751.463098] [stack] [ecebdb70] [c00600cc] __lock_acquire+0x11ac/0x19cc
[ 9751.469624] [stack] [ecebdc00] [c0060fb0] lock_acquire+0x50/0x6c
[ 9751.475633] [stack] [ecebdc20] [c035a558] _raw_spin_lock_bh+0x44/0x5c
[ 9751.482085] [stack] [ecebdc30] [f39709d4] rate_control_get_rate+0xb0/0x148 [mac80211]
[ 9751.489927] [stack] [ecebdc60] [f3980338] ieee80211_proberesp_get+0x1154/0x2c48 [mac80211]
[ 9751.498203] [stack] [ecebdcf0] [f3982a64] ieee80211_get_buffered_bc+0x234/0x268 [mac80211]
[ 9751.506477] [stack] [ecebdd60] [f3983ae8] __ieee80211_tx_skb_tid_band+0x7c/0x728 [mac80211]
[ 9751.514837] [stack] [ecebdd80] [f39659bc] ieee80211_send_delba+0x2bc/0x2e4 [mac80211]
[ 9751.522678] [stack] [ecebddb0] [f39673b8] ieee80211_stop_tx_ba_cb+0xb8/0x190 [mac80211]
[ 9751.530691] [stack] [ecebdde0] [f396c9d4] ieee80211_ibss_leave+0xb08/0x17d0 [mac80211]
[ 9751.538607] [stack] [ecebde40] [c003fa98] process_one_work+0x210/0x3a0
[ 9751.545134] [stack] [ecebde70] [c0040154] worker_thread+0x270/0x428
[ 9751.551402] [stack] [ecebdeb0] [c0044194] kthread+0xd0/0xd4
[ 9751.556977] [stack] [ecebdf40] [c000e634] ret_from_kernel_thread+0x5c/0x64
[ 9751.563848] [stack] --- interrupt: 0 at (null)
[stack] LR = (null)
* Re: Deadlock in mac80211 running 3.18.11 + compat-wireless 2015-03-09
2015-10-06 13:36 Deadlock in mac80211 running 3.18.11 + compat-wireless 2015-03-09 Michael Braun
@ 2015-10-06 13:42 ` Johannes Berg
0 siblings, 0 replies; 2+ messages in thread
From: Johannes Berg @ 2015-10-06 13:42 UTC (permalink / raw)
To: projekt-wlan, linux-wireless
commit 2c158887f1185e04b3763ae346da9f71fcbc4429
Author: Johannes Berg <johannes.berg@intel.com>
Date: Thu Mar 12 19:28:31 2015 +0100
mac80211: agg-tx: avoid sending DelBA with sta->lock held
The rate control locking caused a potential deadlock here due to the
locks being acquired in different orders, so that change cannot yet
be applied. However, there's no fundamental reason for this code to
hold the sta->lock while transmitting frames.
Clearly it's better not to hold the lock for longer periods of time,
which can happen here since we call all the way down to the driver.
Change the code a bit to not hold it while doing that.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
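The patch itself is not quoted in this thread. As a rough sketch of the idea in the commit message (decide under the lock, transmit only after dropping it), the self-contained C model below uses a plain flag in place of the spinlock. `struct sta_model` and all `model_*` functions are hypothetical illustrations and are not mac80211 code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the station state; not the mac80211 struct. */
struct sta_model {
    bool lock_held;        /* models sta->lock */
    bool session_active;   /* models an open TX BA session */
    int  delba_sent;       /* frames handed to the TX path */
    bool tx_saw_lock_held; /* set if the TX path ran with the lock held */
};

static void model_lock(struct sta_model *sta)
{
    assert(!sta->lock_held); /* a real spinlock would spin forever here */
    sta->lock_held = true;
}

static void model_unlock(struct sta_model *sta)
{
    assert(sta->lock_held);
    sta->lock_held = false;
}

/* Models the TX path, which goes on to take the rate control lock. */
static void model_send_delba(struct sta_model *sta)
{
    if (sta->lock_held)
        sta->tx_saw_lock_held = true;
    sta->delba_sent++;
}

/* Update state under the lock, transmit only after dropping it. */
static void model_stop_tx_ba(struct sta_model *sta)
{
    bool send_delba = false;

    model_lock(sta);
    if (sta->session_active) {
        sta->session_active = false;
        send_delba = true; /* remember the decision, do not transmit yet */
    }
    model_unlock(sta);

    if (send_delba)
        model_send_delba(sta);
}
```

Because the frame is sent after `model_unlock()`, the transmit path never runs with the modelled sta->lock held, so the sta->lock -> rate control lock dependency from the trace is not created on this path.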