* [PATCH] wl12xx: trigger recovery for failures in sched scan stop @ 2011-12-21 12:06 Eyal Shapira 2011-12-22 8:59 ` Eliad Peller 0 siblings, 1 reply; 4+ messages in thread From: Eyal Shapira @ 2011-12-21 12:06 UTC (permalink / raw) To: Luciano Coelho; +Cc: linux-wireless mac80211 requires that drv_sched_scan_stop will always succeed. We have a few possible errors such as OOM, failure to ELP wakeup, fail in the FW command. Instead of no-op , trigger a recovery which would mark sched scan as stopped and clear up any bad FW state. Signed-off-by: Eyal Shapira <eyal@wizery.com> --- drivers/net/wireless/wl12xx/main.c | 4 +++- drivers/net/wireless/wl12xx/scan.c | 14 +++++++++----- 2 files changed, 12 insertions(+), 6 deletions(-) diff --git a/drivers/net/wireless/wl12xx/main.c b/drivers/net/wireless/wl12xx/main.c index d5f55a1..0d3de3e 100644 --- a/drivers/net/wireless/wl12xx/main.c +++ b/drivers/net/wireless/wl12xx/main.c @@ -3137,8 +3137,10 @@ static void wl1271_op_sched_scan_stop(struct ieee80211_hw *hw, mutex_lock(&wl->mutex); ret = wl1271_ps_elp_wakeup(wl); - if (ret < 0) + if (ret < 0) { + wl12xx_queue_recovery_work(wl); goto out; + } wl1271_scan_sched_scan_stop(wl); diff --git a/drivers/net/wireless/wl12xx/scan.c b/drivers/net/wireless/wl12xx/scan.c index e24111e..108bed4 100644 --- a/drivers/net/wireless/wl12xx/scan.c +++ b/drivers/net/wireless/wl12xx/scan.c @@ -739,22 +739,26 @@ void wl1271_scan_sched_scan_stop(struct wl1271 *wl) wl1271_debug(DEBUG_CMD, "cmd periodic scan stop"); - /* FIXME: what to do if alloc'ing to stop fails? */ stop = kzalloc(sizeof(*stop), GFP_KERNEL); if (!stop) { wl1271_error("failed to alloc memory to send sched scan stop"); - return; + goto recovery; } stop->tag = WL1271_SCAN_DEFAULT_TAG; ret = wl1271_cmd_send(wl, CMD_STOP_PERIODIC_SCAN, stop, sizeof(*stop), 0); + + kfree(stop); + if (ret < 0) { wl1271_error("failed to send sched scan stop command"); - goto out_free; + goto recovery; } -out_free: - kfree(stop); + return; + +recovery: + wl12xx_queue_recovery_work(wl); } -- 1.7.4.1 ^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [PATCH] wl12xx: trigger recovery for failures in sched scan stop 2011-12-21 12:06 [PATCH] wl12xx: trigger recovery for failures in sched scan stop Eyal Shapira @ 2011-12-22 8:59 ` Eliad Peller [not found] ` <CAF8O4-zh25pd7T937tbmRcFgmTVpZPtJbFP8OJKhgcaeSSqaSw@mail.gmail.com> 0 siblings, 1 reply; 4+ messages in thread From: Eliad Peller @ 2011-12-22 8:59 UTC (permalink / raw) To: Eyal Shapira; +Cc: Luciano Coelho, linux-wireless On Wed, Dec 21, 2011 at 2:06 PM, Eyal Shapira <eyal@wizery.com> wrote: > mac80211 requires that drv_sched_scan_stop > will always succeed. We have a few possible errors > such as OOM, failure to ELP wakeup, fail in the FW command. > Instead of no-op , trigger a recovery which would > mark sched scan as stopped and clear up any bad FW state. > > Signed-off-by: Eyal Shapira <eyal@wizery.com> > --- this patch seems a bit redundant. > ret = wl1271_ps_elp_wakeup(wl); > - if (ret < 0) > + if (ret < 0) { > + wl12xx_queue_recovery_work(wl); > goto out; > + } > we enqueue recovery in wl1271_ps_elp_wakeup() in most error paths. > - /* FIXME: what to do if alloc'ing to stop fails? */ > stop = kzalloc(sizeof(*stop), GFP_KERNEL); > if (!stop) { > wl1271_error("failed to alloc memory to send sched scan stop"); > - return; > + goto recovery; > } not sure if initiating recovery here will really help. it'll probably fail as well, and good chances we are going to panic soon anyway :) > ret = wl1271_cmd_send(wl, CMD_STOP_PERIODIC_SCAN, stop, > sizeof(*stop), 0); > + > + kfree(stop); > + > if (ret < 0) { > wl1271_error("failed to send sched scan stop command"); > - goto out_free; > + goto recovery; > } > wl1271_cmd_send enqueues recovery work on error as well. anyway, the major thing i don't like here is handling the sched_stop in a different manner than the rest of the commands. Eliad. ^ permalink raw reply [flat|nested] 4+ messages in thread
[parent not found: <CAF8O4-zh25pd7T937tbmRcFgmTVpZPtJbFP8OJKhgcaeSSqaSw@mail.gmail.com>]
* Re: [PATCH] wl12xx: trigger recovery for failures in sched scan stop [not found] ` <CAF8O4-zh25pd7T937tbmRcFgmTVpZPtJbFP8OJKhgcaeSSqaSw@mail.gmail.com> @ 2011-12-23 1:11 ` Eyal Shapira 2011-12-23 9:13 ` Luciano Coelho 0 siblings, 1 reply; 4+ messages in thread From: Eyal Shapira @ 2011-12-23 1:11 UTC (permalink / raw) To: Eliad Peller; +Cc: Luciano Coelho, linux-wireless On Fri, Dec 23, 2011 at 3:07 AM, Eyal Shapira <eyal@wizery.com> wrote: > > > > On Thu, Dec 22, 2011 at 10:59 AM, Eliad Peller <eliad@wizery.com> wrote: >> >> On Wed, Dec 21, 2011 at 2:06 PM, Eyal Shapira <eyal@wizery.com> wrote: >> > mac80211 requires that drv_sched_scan_stop >> > will always succeed. We have a few possible errors >> > such as OOM, failure to ELP wakeup, fail in the FW command. >> > Instead of no-op , trigger a recovery which would >> > mark sched scan as stopped and clear up any bad FW state. >> > >> > Signed-off-by: Eyal Shapira <eyal@wizery.com> >> > --- >> this patch seems a bit redundant. >> >> > ret = wl1271_ps_elp_wakeup(wl); >> > - if (ret < 0) >> > + if (ret < 0) { >> > + wl12xx_queue_recovery_work(wl); >> > goto out; >> > + } >> > >> we enqueue recovery in wl1271_ps_elp_wakeup() in most error paths. >> ok. agreed. what about the path of completion error in wl1271_ps_elp_wakeup() ? Why not trigger recovery in that case as well ? > >> >> > - /* FIXME: what to do if alloc'ing to stop fails? */ >> > stop = kzalloc(sizeof(*stop), GFP_KERNEL); >> > if (!stop) { >> > wl1271_error("failed to alloc memory to send sched scan stop"); >> > - return; >> > + goto recovery; >> > } >> not sure if initiating recovery here will really help. it'll probably >> fail as well, and good chances we are going to panic soon anyway :) >> Luca pointed out the same thing so I'm dropping this. The question is what should we do with failing alloc as op_sched_scan_stop should be void. Due to other changes in the sched_scan_stop path in mac/cfg80211 it's more important to call ieee80211_sched_scan_stoppeD() as otherwise allocs done in mac80211 won't be freed as well as the userspace won't be notified of sched scan stop. Options: 1. Call ieee80211_sched_scan_stopped() for any failure (OOM, elp_wakeup, ...). This would free up the allocs in mac80211 and notify userspace but would leave the FW out of sync with the upper layers. The next sched scan initiated would trigger a recovery given that the previous sched scan wasn't stopped. 2. No op - Just ignore kzalloc failure. Sched scan can't be stopped in this case and the userspace won't get any completion event and alloc in mac80211 will remain. 3. Variation of 1 - Call stopped() but propagate to userspace through the NL event that we couldn't really stop. I don't think that would fly as it's more of an API change to userspace. I understand that option #2 is what we're going for lacking a better alternative. Any other ideas / opinions ? >> > ret = wl1271_cmd_send(wl, CMD_STOP_PERIODIC_SCAN, stop, >> > sizeof(*stop), 0); >> > + >> > + kfree(stop); >> > + >> > if (ret < 0) { >> > wl1271_error("failed to send sched scan stop command"); >> > - goto out_free; >> > + goto recovery; >> > } >> > >> wl1271_cmd_send enqueues recovery work on error as well. >> got it. will be dropped. >> >> anyway, the major thing i don't like here is handling the sched_stop >> in a different manner than the rest of the commands. >> >> Eliad. > > ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] wl12xx: trigger recovery for failures in sched scan stop 2011-12-23 1:11 ` Eyal Shapira @ 2011-12-23 9:13 ` Luciano Coelho 0 siblings, 0 replies; 4+ messages in thread From: Luciano Coelho @ 2011-12-23 9:13 UTC (permalink / raw) To: Eyal Shapira; +Cc: Eliad Peller, linux-wireless, johannes On Fri, 2011-12-23 at 03:11 +0200, Eyal Shapira wrote: > On Fri, Dec 23, 2011 at 3:07 AM, Eyal Shapira <eyal@wizery.com> wrote: > > On Thu, Dec 22, 2011 at 10:59 AM, Eliad Peller <eliad@wizery.com> wrote: > >> On Wed, Dec 21, 2011 at 2:06 PM, Eyal Shapira <eyal@wizery.com> wrote: > >> > mac80211 requires that drv_sched_scan_stop > >> > will always succeed. We have a few possible errors > >> > such as OOM, failure to ELP wakeup, fail in the FW command. > >> > Instead of no-op , trigger a recovery which would > >> > mark sched scan as stopped and clear up any bad FW state. > >> > > >> > Signed-off-by: Eyal Shapira <eyal@wizery.com> > >> > --- > >> this patch seems a bit redundant. > >> > >> > ret = wl1271_ps_elp_wakeup(wl); > >> > - if (ret < 0) > >> > + if (ret < 0) { > >> > + wl12xx_queue_recovery_work(wl); > >> > goto out; > >> > + } > >> > > >> we enqueue recovery in wl1271_ps_elp_wakeup() in most error paths. > >> > ok. agreed. what about the path of completion error in wl1271_ps_elp_wakeup() ? > Why not trigger recovery in that case as well ? If the wake up times out, the firmware has very likely crashed or is stuck, so recovery is needed. If the wake up _fails_, on the other hand, things are probably not too bad. We could probably recover without recovery. :P In most cases this will generate a cascade of failures and things may go wrong, but in other cases, such as debugfs, a failure is not bad. > >> > >> > - /* FIXME: what to do if alloc'ing to stop fails? */ > >> > stop = kzalloc(sizeof(*stop), GFP_KERNEL); > >> > if (!stop) { > >> > wl1271_error("failed to alloc memory to send sched scan stop"); > >> > - return; > >> > + goto recovery; > >> > } > >> not sure if initiating recovery here will really help. it'll probably > >> fail as well, and good chances we are going to panic soon anyway :) > >> > Luca pointed out the same thing so I'm dropping this. Okay, I will drop it for now and wait for v2 or a different solution. > The question is what should we do with failing alloc as > op_sched_scan_stop should be void. We could change the op_sched_scan_stop to return int instead. At least in this case there's a good reason for doing it. We don't need to return the error to the userspace (as it happens now in case of failure), but we can at least use that to trigger the deallocation in mac80211. > Due to other changes in the sched_scan_stop path in mac/cfg80211 it's > more important to > call ieee80211_sched_scan_stoppeD() as otherwise allocs done in > mac80211 won't be freed as well as the userspace > won't be notified of sched scan stop. Let's see what comes out of our discussion in the other thread. ;) > Options: > 1. Call ieee80211_sched_scan_stopped() for any failure (OOM, > elp_wakeup, ...). This would free up the allocs in mac80211 and notify > userspace > but would leave the FW out of sync with the upper layers. The next > sched scan initiated would trigger a recovery given that the previous > sched > scan wasn't stopped. This also depends on what happens to the changes in cfg/mac80211. > 2. No op - Just ignore kzalloc failure. Sched scan can't be stopped in > this case and the userspace won't get any completion event and alloc > in > mac80211 will remain. I don't think this is an option. It will leak memory, so the OOM will just get worse. The userspace might have a timeout and retry the stop after sometime, which would make things even worse. > 3. Variation of 1 - Call stopped() but propagate to userspace through > the NL event that we couldn't really stop. > I don't think that would fly as it's more of an API change to userspace. Not so good. As I mentioned in the other thread, maybe we could delay the stop command completion? > I understand that option #2 is what we're going for lacking a better > alternative. > Any other ideas / opinions ? Let's wait and see what comes out of the cfg/mac80211 discussion. > >> anyway, the major thing i don't like here is handling the sched_stop > >> in a different manner than the rest of the commands. This was exactly one of my concerns when I mentioned privately to Eyal that this looked a bit weird. I don't see a good reason why sched_scan_stop would have more dramatic failure consequences than other commands. -- Cheers, Luca. ^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2011-12-23 9:13 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-12-21 12:06 [PATCH] wl12xx: trigger recovery for failures in sched scan stop Eyal Shapira
2011-12-22 8:59 ` Eliad Peller
[not found] ` <CAF8O4-zh25pd7T937tbmRcFgmTVpZPtJbFP8OJKhgcaeSSqaSw@mail.gmail.com>
2011-12-23 1:11 ` Eyal Shapira
2011-12-23 9:13 ` Luciano Coelho
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox