* [PATCH] wlcore: Fix BUG with clear completion on timeout
@ 2018-10-01 21:38 Tony Lindgren
[not found] ` <20181001213805.86511-1-tony-4v6yS6AI5VpBDgjK7y7TUQ@public.gmane.org>
0 siblings, 1 reply; 4+ messages in thread
From: Tony Lindgren @ 2018-10-01 21:38 UTC
To: Kalle Valo
Cc: Eyal Reizer, Kishon Vijay Abraham I, Guy Mishol, Luca Coelho,
Maital Hahn, Maxim Altshul, Shahar Patury,
linux-wireless-u79uwXL29TY76Z2rM5mHXA,
linux-omap-u79uwXL29TY76Z2rM5mHXA
We do not currently clear wl->elp_compl on ELP timeout, so we are left
with a bogus lingering pointer that wlcore_irq will then try to access
after recovery is done:
BUG: spinlock bad magic on CPU#1, irq/255-wl12xx/580
...
(spin_dump) from [<c01b9344>] (do_raw_spin_lock+0xc8/0x124)
(do_raw_spin_lock) from [<c09b3970>] (_raw_spin_lock_irqsave+0x68/0x74)
(_raw_spin_lock_irqsave) from [<c01a02f0>] (complete+0x24/0x58)
(complete) from [<bf572610>] (wlcore_irq+0x48/0x17c [wlcore])
(wlcore_irq [wlcore]) from [<c01c5efc>] (irq_thread_fn+0x2c/0x64)
(irq_thread_fn) from [<c01c623c>] (irq_thread+0x148/0x290)
(irq_thread) from [<c016b4b0>] (kthread+0x160/0x17c)
(kthread) from [<c01010b4>] (ret_from_fork+0x14/0x20)
...
After that the system will hang. Let's fix this by adding a flag for
recovery and moving the recovery work call to the error handling
section.
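To illustrate this part of the fix, here is a minimal user-space sketch in
plain C. It is not the driver code: fake_dev, fake_completion, fake_irq and
fake_resume are hypothetical stand-ins, and the real completion and spinlock
handling is reduced to a pointer and an int. It shows a stack-allocated
completion published through a shared pointer, and a single exit label that
unpublishes it on both the success and the timeout path before recovery is
requested.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins; none of these are wlcore types. */
struct fake_completion {
	int done;
};

struct fake_dev {
	struct fake_completion *elp_compl;	/* shared with the "IRQ" path */
	bool recovery_queued;
};

/* Models the IRQ thread: signals the waiter only if one is registered. */
static bool fake_irq(struct fake_dev *dev)
{
	if (!dev->elp_compl)
		return false;

	dev->elp_compl->done = 1;
	return true;
}

/*
 * Models the resume path. The completion lives on this function's stack,
 * so every exit has to pass through "out" and unpublish the pointer.
 */
static int fake_resume(struct fake_dev *dev, bool wakeup_arrives)
{
	struct fake_completion wake_compl = { 0 };
	bool recovery = false;
	int ret = 0;

	assert(dev != NULL);
	dev->elp_compl = &wake_compl;

	if (wakeup_arrives)
		fake_irq(dev);

	if (!wake_compl.done) {
		/* Timeout: remember to recover, but do not return early. */
		recovery = true;
		goto out;
	}

out:
	dev->elp_compl = NULL;		/* never leave a stack pointer behind */
	if (recovery)
		dev->recovery_queued = true;

	return ret;
}
```

With an early return on the timeout path, dev->elp_compl would still point
into a dead stack frame afterwards, which is the situation the BUG trace
above comes from.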
We also want to set WL1271_FLAG_INTENDED_FW_RECOVERY, and actually
clear it again in wl1271_recovery_work(), and downgrade the error to a
warning to prevent overly verbose output.
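The set-then-clear life cycle of the flag can be sketched in user-space C
as well. This is only an illustration under assumptions: fake_wl,
FAKE_FLAG_INTENDED_RECOVERY and the two helpers are hypothetical, C11
atomics stand in for the kernel's set_bit()/clear_bit(), and a counter
stands in for the BUG_ON() check. Unlike the patch, which tests and clears
the bit in two steps, the sketch consumes it with one atomic fetch-and.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical flag bit; the real driver keeps its flags in wl->flags. */
enum { FAKE_FLAG_INTENDED_RECOVERY = 1u << 0 };

struct fake_wl {
	atomic_uint flags;
	unsigned int unexpected_recoveries;	/* stand-in for BUG_ON() */
};

/* Caller side: mark the upcoming recovery as intended before queuing it. */
static void fake_request_recovery(struct fake_wl *wl)
{
	assert(wl != NULL);
	atomic_fetch_or(&wl->flags, FAKE_FLAG_INTENDED_RECOVERY);
}

/*
 * Worker side: consume the flag. If it were left set, every later
 * recovery would look intended as well.
 */
static void fake_recovery_work(struct fake_wl *wl)
{
	unsigned int old;

	assert(wl != NULL);
	old = atomic_fetch_and(&wl->flags,
			       ~(unsigned int)FAKE_FLAG_INTENDED_RECOVERY);
	if (!(old & FAKE_FLAG_INTENDED_RECOVERY))
		wl->unexpected_recoveries++;
}
```

A recovery that follows fake_request_recovery() is treated as intended
exactly once; a second recovery without a new request is counted as
unexpected.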
Cc: Eyal Reizer <eyalr-l0cyMroinI0@public.gmane.org>
Signed-off-by: Tony Lindgren <tony-4v6yS6AI5VpBDgjK7y7TUQ@public.gmane.org>
---
drivers/net/wireless/ti/wlcore/main.c | 18 ++++++++++++++----
1 file changed, 14 insertions(+), 4 deletions(-)
diff --git a/drivers/net/wireless/ti/wlcore/main.c b/drivers/net/wireless/ti/wlcore/main.c
--- a/drivers/net/wireless/ti/wlcore/main.c
+++ b/drivers/net/wireless/ti/wlcore/main.c
@@ -957,6 +957,8 @@ static void wl1271_recovery_work(struct work_struct *work)
 	BUG_ON(wl->conf.recovery.bug_on_recovery &&
 	       !test_bit(WL1271_FLAG_INTENDED_FW_RECOVERY, &wl->flags));
 
+	clear_bit(WL1271_FLAG_INTENDED_FW_RECOVERY, &wl->flags);
+
 	if (wl->conf.recovery.no_recovery) {
 		wl1271_info("No recovery (chosen on module load). Fw will remain stuck.");
 		goto out_unlock;
@@ -6710,6 +6712,7 @@ static int __maybe_unused wlcore_runtime_resume(struct device *dev)
 	int ret;
 	unsigned long start_time = jiffies;
 	bool pending = false;
+	bool recovery = false;
 
 	/* Nothing to do if no ELP mode requested */
 	if (!test_bit(WL1271_FLAG_IN_ELP, &wl->flags))
@@ -6726,7 +6729,7 @@ static int __maybe_unused wlcore_runtime_resume(struct device *dev)
 
 	ret = wlcore_raw_write32(wl, HW_ACCESS_ELP_CTRL_REG, ELPCTRL_WAKE_UP);
 	if (ret < 0) {
-		wl12xx_queue_recovery_work(wl);
+		recovery = true;
 		goto err;
 	}
 
@@ -6734,11 +6737,12 @@ static int __maybe_unused wlcore_runtime_resume(struct device *dev)
 		ret = wait_for_completion_timeout(&compl,
 			msecs_to_jiffies(WL1271_WAKEUP_TIMEOUT));
 		if (ret == 0) {
-			wl1271_error("ELP wakeup timeout!");
-			wl12xx_queue_recovery_work(wl);
+			wl1271_warning("ELP wakeup timeout!");
 
 			/* Return no error for runtime PM for recovery */
-			return 0;
+			ret = 0;
+			recovery = true;
+			goto err;
 		}
 	}
 
@@ -6753,6 +6757,12 @@ static int __maybe_unused wlcore_runtime_resume(struct device *dev)
 	spin_lock_irqsave(&wl->wl_lock, flags);
 	wl->elp_compl = NULL;
 	spin_unlock_irqrestore(&wl->wl_lock, flags);
+
+	if (recovery) {
+		set_bit(WL1271_FLAG_INTENDED_FW_RECOVERY, &wl->flags);
+		wl12xx_queue_recovery_work(wl);
+	}
+
 	return ret;
 }
 
--
2.19.0
[parent not found: <20181001213805.86511-1-tony-4v6yS6AI5VpBDgjK7y7TUQ@public.gmane.org>]
* Re: [PATCH] wlcore: Fix BUG with clear completion on timeout
[not found] ` <20181001213805.86511-1-tony-4v6yS6AI5VpBDgjK7y7TUQ@public.gmane.org>
@ 2018-10-05 8:33 ` Kalle Valo
[not found] ` <20181005083324.D2D5160818-4h6buKAYkuurB/BPivuO70B+6BGkLq7r@public.gmane.org>
0 siblings, 1 reply; 4+ messages in thread
From: Kalle Valo @ 2018-10-05 8:33 UTC
To: Tony Lindgren
Cc: Eyal Reizer, Kishon Vijay Abraham I, Guy Mishol, Luca Coelho,
Maital Hahn, Maxim Altshul, Shahar Patury,
linux-wireless-u79uwXL29TY76Z2rM5mHXA,
linux-omap-u79uwXL29TY76Z2rM5mHXA

Tony Lindgren <tony-4v6yS6AI5VpBDgjK7y7TUQ@public.gmane.org> wrote:

> We do not currently clear wl->elp_compl on ELP timeout and we have bogus
> lingering pointer that wlcore_irq then will try to access after recovery
> is done:
>
> BUG: spinlock bad magic on CPU#1, irq/255-wl12xx/580
> ...
> (spin_dump) from [<c01b9344>] (do_raw_spin_lock+0xc8/0x124)
> (do_raw_spin_lock) from [<c09b3970>] (_raw_spin_lock_irqsave+0x68/0x74)
> (_raw_spin_lock_irqsave) from [<c01a02f0>] (complete+0x24/0x58)
> (complete) from [<bf572610>] (wlcore_irq+0x48/0x17c [wlcore])
> (wlcore_irq [wlcore]) from [<c01c5efc>] (irq_thread_fn+0x2c/0x64)
> (irq_thread_fn) from [<c01c623c>] (irq_thread+0x148/0x290)
> (irq_thread) from [<c016b4b0>] (kthread+0x160/0x17c)
> (kthread) from [<c01010b4>] (ret_from_fork+0x14/0x20)
> ...
>
> After that the system will hang. Let's fix this by adding a flag for
> recovery and moving the recovery work call to the error handling
> section.
>
> And we want to set WL1271_FLAG_INTENDED_FW_RECOVERY and actually clear
> it too in wl1271_recovery_work() and just downgrade the error to a
> warning to prevent overly verbose output.
>
> Cc: Eyal Reizer <eyalr-l0cyMroinI0@public.gmane.org>
> Signed-off-by: Tony Lindgren <tony-4v6yS6AI5VpBDgjK7y7TUQ@public.gmane.org>

Patch applied to wireless-drivers-next.git, thanks.

4e651bad8489 wlcore: Fix BUG with clear completion on timeout

--
https://patchwork.kernel.org/patch/10622767/

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches
[parent not found: <20181005083324.D2D5160818-4h6buKAYkuurB/BPivuO70B+6BGkLq7r@public.gmane.org>]
* Re: [PATCH] wlcore: Fix BUG with clear completion on timeout
[not found] ` <20181005083324.D2D5160818-4h6buKAYkuurB/BPivuO70B+6BGkLq7r@public.gmane.org>
@ 2018-11-30 13:16 ` Adam Ford
[not found] ` <CAHCN7xKiCrrG=YLMX4gTtgfEQ3zG3P++OpG5aaQ3uoFO9A5RSA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
0 siblings, 1 reply; 4+ messages in thread
From: Adam Ford @ 2018-11-30 13:16 UTC
To: kvalo-sgV2jX0FEOL9JmXXK+q4OQ
Cc: Tony Lindgren, Reizer, Eyal, Kishon Vijay Abraham I,
guym-l0cyMroinI0, luciano.coelho-ral2JQCrhuEAvxtiuMwx3w,
maitalm-l0cyMroinI0, maxim.altshul-l0cyMroinI0, shaharp-l0cyMroinI0,
linux-wireless-u79uwXL29TY76Z2rM5mHXA,
linux-omap-u79uwXL29TY76Z2rM5mHXA

On Fri, Oct 5, 2018 at 3:33 AM Kalle Valo <kvalo-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org> wrote:
>
> Tony Lindgren <tony-4v6yS6AI5VpBDgjK7y7TUQ@public.gmane.org> wrote:
>
> > We do not currently clear wl->elp_compl on ELP timeout and we have bogus
> > lingering pointer that wlcore_irq then will try to access after recovery
> > is done:
> >
> > BUG: spinlock bad magic on CPU#1, irq/255-wl12xx/580
> > ...
> > (spin_dump) from [<c01b9344>] (do_raw_spin_lock+0xc8/0x124)
> > (do_raw_spin_lock) from [<c09b3970>] (_raw_spin_lock_irqsave+0x68/0x74)
> > (_raw_spin_lock_irqsave) from [<c01a02f0>] (complete+0x24/0x58)
> > (complete) from [<bf572610>] (wlcore_irq+0x48/0x17c [wlcore])
> > (wlcore_irq [wlcore]) from [<c01c5efc>] (irq_thread_fn+0x2c/0x64)
> > (irq_thread_fn) from [<c01c623c>] (irq_thread+0x148/0x290)
> > (irq_thread) from [<c016b4b0>] (kthread+0x160/0x17c)
> > (kthread) from [<c01010b4>] (ret_from_fork+0x14/0x20)
> > ...
> >
> > After that the system will hang. Let's fix this by adding a flag for
> > recovery and moving the recovery work call to the error handling
> > section.
> >
> > And we want to set WL1271_FLAG_INTENDED_FW_RECOVERY and actually clear
> > it too in wl1271_recovery_work() and just downgrade the error to a
> > warning to prevent overly verbose output.
> >

Do we know how far back this bug goes and which versions need this
patch applied to it? I have seen something similar on 4.19, but I
haven't tried this patch to fix it. It wasn't clear to me if this is
linux-next or 4.19 or something different.

thanks

adam

> > Cc: Eyal Reizer <eyalr-l0cyMroinI0@public.gmane.org>
> > Signed-off-by: Tony Lindgren <tony-4v6yS6AI5VpBDgjK7y7TUQ@public.gmane.org>
>
> Patch applied to wireless-drivers-next.git, thanks.
>
> 4e651bad8489 wlcore: Fix BUG with clear completion on timeout
>
> --
> https://patchwork.kernel.org/patch/10622767/
>
> https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches
>
[parent not found: <CAHCN7xKiCrrG=YLMX4gTtgfEQ3zG3P++OpG5aaQ3uoFO9A5RSA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: [PATCH] wlcore: Fix BUG with clear completion on timeout
[not found] ` <CAHCN7xKiCrrG=YLMX4gTtgfEQ3zG3P++OpG5aaQ3uoFO9A5RSA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2018-11-30 18:32 ` Tony Lindgren
0 siblings, 0 replies; 4+ messages in thread
From: Tony Lindgren @ 2018-11-30 18:32 UTC
To: Adam Ford
Cc: kvalo-sgV2jX0FEOL9JmXXK+q4OQ, Reizer, Eyal, Kishon Vijay Abraham I,
guym-l0cyMroinI0, luciano.coelho-ral2JQCrhuEAvxtiuMwx3w,
maitalm-l0cyMroinI0, maxim.altshul-l0cyMroinI0, shaharp-l0cyMroinI0,
linux-wireless-u79uwXL29TY76Z2rM5mHXA,
linux-omap-u79uwXL29TY76Z2rM5mHXA

Hi,

* Adam Ford <aford173-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> [181130 13:16]:
> On Fri, Oct 5, 2018 at 3:33 AM Kalle Valo <kvalo-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org> wrote:
> >
> > Tony Lindgren <tony-4v6yS6AI5VpBDgjK7y7TUQ@public.gmane.org> wrote:
> >
> > > We do not currently clear wl->elp_compl on ELP timeout and we have bogus
> > > lingering pointer that wlcore_irq then will try to access after recovery
> > > is done:
> > >
> > > BUG: spinlock bad magic on CPU#1, irq/255-wl12xx/580
> > > ...
> > > (spin_dump) from [<c01b9344>] (do_raw_spin_lock+0xc8/0x124)
> > > (do_raw_spin_lock) from [<c09b3970>] (_raw_spin_lock_irqsave+0x68/0x74)
> > > (_raw_spin_lock_irqsave) from [<c01a02f0>] (complete+0x24/0x58)
> > > (complete) from [<bf572610>] (wlcore_irq+0x48/0x17c [wlcore])
> > > (wlcore_irq [wlcore]) from [<c01c5efc>] (irq_thread_fn+0x2c/0x64)
> > > (irq_thread_fn) from [<c01c623c>] (irq_thread+0x148/0x290)
> > > (irq_thread) from [<c016b4b0>] (kthread+0x160/0x17c)
> > > (kthread) from [<c01010b4>] (ret_from_fork+0x14/0x20)
> > > ...
> > >
> > > After that the system will hang. Let's fix this by adding a flag for
> > > recovery and moving the recovery work call to the error handling
> > > section.
> > >
> > > And we want to set WL1271_FLAG_INTENDED_FW_RECOVERY and actually clear
> > > it too in wl1271_recovery_work() and just downgrade the error to a
> > > warning to prevent overly verbose output.
> > >
>
> Do we know how far back this bug goes and which versions need this
> patch applied to it? I have seen something similar on 4.19, but I
> haven't tried this patch to fix it. It wasn't clear to me if this is
> linux-next or 4.19 or something different.

I'm not sure if this is needed for v4.19 as the wakeirq patch is not
there. Maybe give it a try and see if it helps with the issue you're
seeing, then request inclusion for stable if it helps?

BTW any wlcore issues with earlier kernels should be separately
debugged and tested. Fixes done after changing wlcore to use PM
runtime and wakeirq may be incomplete for earlier kernels; that is,
the two commits below and any changes related to them.

And in general there seem to be two categories of common issues
with wlcore that I've seen: the GPIO interrupt not behaving with the
SoC, or old firmware being used for wlcore.

Regards,

Tony

8< -----------------
3c83dd577c7f ("wlcore: Add support for optional wakeirq")
fa2648a34e73 ("wlcore: Add support for runtime PM")
Thread overview: 4+ messages
2018-10-01 21:38 [PATCH] wlcore: Fix BUG with clear completion on timeout Tony Lindgren
[not found] ` <20181001213805.86511-1-tony-4v6yS6AI5VpBDgjK7y7TUQ@public.gmane.org>
2018-10-05 8:33 ` Kalle Valo
[not found] ` <20181005083324.D2D5160818-4h6buKAYkuurB/BPivuO70B+6BGkLq7r@public.gmane.org>
2018-11-30 13:16 ` Adam Ford
[not found] ` <CAHCN7xKiCrrG=YLMX4gTtgfEQ3zG3P++OpG5aaQ3uoFO9A5RSA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2018-11-30 18:32 ` Tony Lindgren