* [PATCH wireless] ar5523: Fix deadlock bugs caused by cancel_work_sync in ar5523_stop
From: Duoming Zhou @ 2022-05-22 13:30 UTC
To: linux-wireless
Cc: pontus.fuchs, kvalo, davem, edumazet, kuba, pabeni, netdev,
linux-kernel, Duoming Zhou
If the work item is running, cancel_work_sync() in ar5523_stop() will not
return until the work item has finished. ar5523_stop() calls
cancel_work_sync() while holding ar->mutex, but work items such as
ar5523_tx_wd_work() and ar5523_tx_work() also need to take ar->mutex. As a
result, ar5523_stop() can block forever. One of the race conditions is
shown below:

   (Thread 1)              |   (Thread 2)
ar5523_stop                |
  mutex_lock(&ar->mutex)   | ar5523_tx_wd_work
                           |   mutex_lock(&ar->mutex)
  cancel_work_sync         |   ...

This patch moves the cancel_work_sync() calls out of the ar->mutex critical
section in order to avoid this deadlock.

Fixes: b7d572e1871d ("ar5523: Add new driver")
Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
---
drivers/net/wireless/ath/ar5523/ar5523.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/net/wireless/ath/ar5523/ar5523.c b/drivers/net/wireless/ath/ar5523/ar5523.c
index 9cabd342d15..99d6b13ffcf 100644
--- a/drivers/net/wireless/ath/ar5523/ar5523.c
+++ b/drivers/net/wireless/ath/ar5523/ar5523.c
@@ -1071,8 +1071,10 @@ static void ar5523_stop(struct ieee80211_hw *hw)
 	ar5523_cmd_write(ar, WDCMSG_TARGET_STOP, NULL, 0, 0);
 
 	del_timer_sync(&ar->tx_wd_timer);
+	mutex_unlock(&ar->mutex);
 	cancel_work_sync(&ar->tx_wd_work);
 	cancel_work_sync(&ar->rx_refill_work);
+	mutex_lock(&ar->mutex);
 	ar5523_cancel_rx_bufs(ar);
 	mutex_unlock(&ar->mutex);
 }
--
2.17.1
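
The lock ordering in this patch can be illustrated with a minimal userspace sketch. This is not driver code. It assumes POSIX threads stand in for the kernel workqueue, with pthread_join() modelling only the "wait for the running work item" half of cancel_work_sync(). The names dev_mutex, work_runs, wd_work and stop_device are hypothetical.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-ins for ar->mutex and for state the work item touches. */
static pthread_mutex_t dev_mutex = PTHREAD_MUTEX_INITIALIZER;
static int work_runs;

/* Like ar5523_tx_wd_work(), the worker must take the device mutex. */
static void *wd_work(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&dev_mutex);
	work_runs++;
	pthread_mutex_unlock(&dev_mutex);
	return NULL;
}

/*
 * Stop path using the ordering from the patch. Returns how many times the
 * worker ran during this call, or -1 if the worker could not be started.
 */
static int stop_device(void)
{
	pthread_t worker;
	int rc, runs;

	pthread_mutex_lock(&dev_mutex);
	work_runs = 0;
	if (pthread_create(&worker, NULL, wd_work, NULL) != 0) {
		pthread_mutex_unlock(&dev_mutex);
		return -1;
	}

	/*
	 * Calling pthread_join() here, with dev_mutex still held, would
	 * deadlock: the worker waits for the mutex and we wait for the worker.
	 */
	pthread_mutex_unlock(&dev_mutex);
	rc = pthread_join(worker, NULL);	/* waits like cancel_work_sync() */
	assert(rc == 0);
	(void)rc;
	pthread_mutex_lock(&dev_mutex);

	runs = work_runs;			/* remaining teardown under the lock */
	pthread_mutex_unlock(&dev_mutex);
	return runs;
}
```

Calling stop_device() returns 1 once the worker has taken the mutex and finished. The window between the unlock and the re-lock is where other code can also take the mutex, which is the part of this approach questioned in the review that follows.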

* Re: [PATCH wireless] ar5523: Fix deadlock bugs caused by cancel_work_sync in ar5523_stop
From: Kalle Valo @ 2022-05-30 11:24 UTC
To: Duoming Zhou
Cc: linux-wireless, pontus.fuchs, davem, edumazet, kuba, pabeni, netdev,
linux-kernel

Duoming Zhou <duoming@zju.edu.cn> writes:

> If the work item is running, cancel_work_sync() in ar5523_stop() will not
> return until the work item has finished. ar5523_stop() calls
> cancel_work_sync() while holding ar->mutex, but work items such as
> ar5523_tx_wd_work() and ar5523_tx_work() also need to take ar->mutex. As a
> result, ar5523_stop() can block forever. One of the race conditions is
> shown below:
>
>    (Thread 1)              |   (Thread 2)
> ar5523_stop                |
>   mutex_lock(&ar->mutex)   | ar5523_tx_wd_work
>                            |   mutex_lock(&ar->mutex)
>   cancel_work_sync         |   ...
>
> This patch moves the cancel_work_sync() calls out of the ar->mutex critical
> section in order to avoid this deadlock.
>
> Fixes: b7d572e1871d ("ar5523: Add new driver")
> Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>

I assume you found this with a static checker tool; it would be good to
document which tool you are using. And if you have not tested this with
real hardware, clearly mention that with "Compile tested only".

> ---
>  drivers/net/wireless/ath/ar5523/ar5523.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/drivers/net/wireless/ath/ar5523/ar5523.c b/drivers/net/wireless/ath/ar5523/ar5523.c
> index 9cabd342d15..99d6b13ffcf 100644
> --- a/drivers/net/wireless/ath/ar5523/ar5523.c
> +++ b/drivers/net/wireless/ath/ar5523/ar5523.c
> @@ -1071,8 +1071,10 @@ static void ar5523_stop(struct ieee80211_hw *hw)
>  	ar5523_cmd_write(ar, WDCMSG_TARGET_STOP, NULL, 0, 0);
>
>  	del_timer_sync(&ar->tx_wd_timer);
> +	mutex_unlock(&ar->mutex);
>  	cancel_work_sync(&ar->tx_wd_work);
>  	cancel_work_sync(&ar->rx_refill_work);
> +	mutex_lock(&ar->mutex);
>  	ar5523_cancel_rx_bufs(ar);
>  	mutex_unlock(&ar->mutex);
>  }

Releasing a lock and taking it again looks like a hack to me. Please
test with a real device and try to find a better solution.

--
https://patchwork.kernel.org/project/linux-wireless/list/
https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches

* Re: [PATCH wireless] ar5523: Fix deadlock bugs caused by cancel_work_sync in ar5523_stop
From: duoming @ 2022-05-31 7:50 UTC
To: Kalle Valo
Cc: linux-wireless, pontus.fuchs, davem, edumazet, kuba, pabeni, netdev,
linux-kernel

Hello,

On Mon, 30 May 2022 14:24:04 +0300 Kalle Valo wrote:

> Duoming Zhou <duoming@zju.edu.cn> writes:
>
> > If the work item is running, cancel_work_sync() in ar5523_stop() will not
> > return until the work item has finished. ar5523_stop() calls
> > cancel_work_sync() while holding ar->mutex, but work items such as
> > ar5523_tx_wd_work() and ar5523_tx_work() also need to take ar->mutex. As a
> > result, ar5523_stop() can block forever. One of the race conditions is
> > shown below:
> >
> >    (Thread 1)              |   (Thread 2)
> > ar5523_stop                |
> >   mutex_lock(&ar->mutex)   | ar5523_tx_wd_work
> >                            |   mutex_lock(&ar->mutex)
> >   cancel_work_sync         |   ...
> >
> > This patch moves the cancel_work_sync() calls out of the ar->mutex critical
> > section in order to avoid this deadlock.
> >
> > Fixes: b7d572e1871d ("ar5523: Add new driver")
> > Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
>
> I assume you found this with a static checker tool; it would be good to
> document which tool you are using. And if you have not tested this with
> real hardware, clearly mention that with "Compile tested only".
>
> > ---
> >  drivers/net/wireless/ath/ar5523/ar5523.c | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/drivers/net/wireless/ath/ar5523/ar5523.c b/drivers/net/wireless/ath/ar5523/ar5523.c
> > index 9cabd342d15..99d6b13ffcf 100644
> > --- a/drivers/net/wireless/ath/ar5523/ar5523.c
> > +++ b/drivers/net/wireless/ath/ar5523/ar5523.c
> > @@ -1071,8 +1071,10 @@ static void ar5523_stop(struct ieee80211_hw *hw)
> >  	ar5523_cmd_write(ar, WDCMSG_TARGET_STOP, NULL, 0, 0);
> >
> >  	del_timer_sync(&ar->tx_wd_timer);
> > +	mutex_unlock(&ar->mutex);
> >  	cancel_work_sync(&ar->tx_wd_work);
> >  	cancel_work_sync(&ar->rx_refill_work);
> > +	mutex_lock(&ar->mutex);
> >  	ar5523_cancel_rx_bufs(ar);
> >  	mutex_unlock(&ar->mutex);
> >  }
>
> Releasing a lock and taking it again looks like a hack to me. Please
> test with a real device and try to find a better solution.

The following is a new solution:

diff --git a/drivers/net/wireless/ath/ar5523/ar5523.c b/drivers/net/wireless/ath/ar5523/ar5523.c
index 9cabd342d15..8adae85fcb9 100644
--- a/drivers/net/wireless/ath/ar5523/ar5523.c
+++ b/drivers/net/wireless/ath/ar5523/ar5523.c
@@ -910,7 +910,11 @@ static void ar5523_tx_wd_work(struct work_struct *work)
 	 * recover seems to be to reset the dongle.
 	 */
 
-	mutex_lock(&ar->mutex);
+	if (!mutex_trylock(&ar->mutex)) {
+		if (test_bit(AR5523_HW_UP, &ar->flags))
+			ieee80211_queue_work(ar->hw, &ar->tx_wd_work);
+		return;
+	}
 	ar5523_err(ar, "TX queue stuck (tot %d pend %d)\n",
 		   atomic_read(&ar->tx_nr_total),
 		   atomic_read(&ar->tx_nr_pending));

If ar5523_stop() holds ar->mutex, ar5523_tx_wd_work() returns
immediately. If ar->mutex is held by a function other than
ar5523_stop(), ar5523_tx_wd_work() re-queues itself. So this solution
could mitigate the deadlock between ar5523_stop() and
ar5523_tx_wd_work().

Best regards,
Duoming Zhou
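
The trylock-and-requeue idea in the message above can be sketched in userspace as follows. This is an illustration under assumptions, not the driver's implementation: a pthread mutex stands in for ar->mutex, the boolean hw_up stands in for the AR5523_HW_UP flag, and it is assumed that the stop path clears that flag before waiting for the work item. The names wd_result, wd_work_once and wd_demo are hypothetical, and "requeue" is reported as a return value instead of queueing real work.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical outcomes of one watchdog work invocation. */
enum wd_result { WD_RAN, WD_REQUEUED, WD_SKIPPED };

static pthread_mutex_t dev_mutex = PTHREAD_MUTEX_INITIALIZER;
static bool hw_up;	/* stand-in for test_bit(AR5523_HW_UP, &ar->flags) */

/* One run of the work item: it never blocks on the device mutex. */
static enum wd_result wd_work_once(void)
{
	if (pthread_mutex_trylock(&dev_mutex) != 0) {
		/* Lock is busy. Retry later only if the device is still up. */
		return hw_up ? WD_REQUEUED : WD_SKIPPED;
	}
	/* ... the reset handling would run here, under the lock ... */
	pthread_mutex_unlock(&dev_mutex);
	return WD_RAN;
}

/* Usage: exercises the three outcomes from a single thread. */
static void wd_demo(void)
{
	hw_up = true;
	assert(wd_work_once() == WD_RAN);	/* lock free: work runs */

	pthread_mutex_lock(&dev_mutex);		/* another path holds the lock */
	assert(wd_work_once() == WD_REQUEUED);	/* device up: try again later */
	hw_up = false;				/* stop path cleared the flag */
	assert(wd_work_once() == WD_SKIPPED);	/* device going down: give up */
	pthread_mutex_unlock(&dev_mutex);
}
```

Because the work item never sleeps on the mutex, a stop path that waits for it while holding the lock cannot be blocked by it. In this sketch the hw_up check is made without the lock, as in the proposed diff.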