Linux wireless drivers development
From: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
To: Matthew Leach <matthew.leach@collabora.com>,
	Jeff Johnson <jjohnson@kernel.org>
Cc: linux-wireless@vger.kernel.org, ath11k@lists.infradead.org,
	linux-kernel@vger.kernel.org, kernel@collabora.com
Subject: Re: [PATCH RESEND RFC 1/3] net: ath11k: fix redundant reset from stale pending workqueue bit
Date: Tue, 12 May 2026 16:09:40 -0700
Message-ID: <e9a822df-f0e1-464b-af99-c1ca315ec5cf@oss.qualcomm.com>
In-Reply-To: <20260330-ath11k-lockup-fixes-v1-1-7ed21095c2c4@collabora.com>

On 3/30/2026 3:05 AM, Matthew Leach wrote:
> During a firmware lockup, WMI commands time out in rapid succession,
> each calling queue_work() to schedule ath11k_core_reset().  This can
> cause a spurious extra reset after recovery completes:
> 
> 1. First WMI timeout calls queue_work(), sets the pending bit and
>    schedules ath11k_core_reset(). The workqueue clears the pending bit
>    before invoking the work function. reset_count becomes 1 and the reset
>    is kicked off asynchronously. ath11k_core_reset() returns.
> 
> 2. Second WMI timeout calls queue_work() and re-queues the work. When it
>    runs after step 1 returns, it sees reset_count > 1 and blocks in
>    wait_for_completion(). The pending bit is again cleared.
> 
> 3. Third WMI timeout calls queue_work(), the pending bit was cleared in
>    step 2, so this succeeds and arms another execution.
> 
> 4. The asynchronous reset finishes. ath11k_mac_op_reconfig_complete()
>    decrements reset_count and calls complete(). The blocked worker from
>    step 2 wakes, takes the early-exit path, and decrements reset_count to
>    0.
> 
> 5. The workqueue sees the pending bit from step 3 and runs
>    ath11k_core_reset() again. reset_count is 0, triggering a
>    full redundant hardware reset.
> 
> Fix this by calling cancel_work() on reset_work in
> ath11k_mac_op_reconfig_complete() before signalling completion. This
> clears any stale pending bit, preventing the spurious re-execution.
> 
> Signed-off-by: Matthew Leach <matthew.leach@collabora.com>
> ---
>  drivers/net/wireless/ath/ath11k/mac.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/drivers/net/wireless/ath/ath11k/mac.c b/drivers/net/wireless/ath/ath11k/mac.c
> index e4ee2ba1f669..748f779b3d1b 100644
> --- a/drivers/net/wireless/ath/ath11k/mac.c
> +++ b/drivers/net/wireless/ath/ath11k/mac.c
> @@ -9274,6 +9274,10 @@ ath11k_mac_op_reconfig_complete(struct ieee80211_hw *hw,
>  			 * the recovery has to be done for each radio
>  			 */
>  			if (recovery_count == ab->num_radios) {
> +				/* Cancel any pending work, preventing a second redudant

nits:
1) networking no longer uses a different block comment style, so use the
   standard style where /* is on a line by itself
2) s/redudant/redundant/ (subject has it right)

but don't post a new version just for these -- wait for any other comments.
I'm pinging the development team to look at this thread.

> +				 * reset.
> +				 */
> +				cancel_work(&ab->reset_work);
>  				atomic_dec(&ab->reset_count);
>  				complete(&ab->reset_complete);
>  				ab->is_reset = false;
> 
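
For readers following the five steps in the commit message, the sequence
can be sketched as a small single-threaded userspace model. This is only
an illustration under stated assumptions, not ath11k or workqueue code:
struct sim and the sim_*() helpers are hypothetical names, the pending
flag stands in for the work item's pending bit, and the blocking wait in
step 2 is modelled by a return value rather than a real
wait_for_completion(). The multi-line comments use the standard block
comment style mentioned in nit 1.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy state; none of these names exist in the kernel. */
struct sim {
	bool pending;		/* stands in for the work item's pending bit */
	int reset_count;	/* stands in for ab->reset_count */
	int hw_resets;		/* full hardware resets kicked off */
};

/* Like queue_work(): fails if the work item is already pending. */
static bool sim_queue_work(struct sim *s)
{
	if (s->pending)
		return false;
	s->pending = true;
	return true;
}

/* Like cancel_work(): clears the pending bit, reports whether it was set. */
static bool sim_cancel_work(struct sim *s)
{
	bool was_pending = s->pending;

	s->pending = false;
	return was_pending;
}

/*
 * Entry of one worker run. The pending bit is cleared before the work
 * function body runs. Returns true if this run would block waiting for
 * the in-flight reset, false if it starts a hardware reset itself.
 */
static bool sim_reset_enter(struct sim *s)
{
	assert(s->pending);
	s->pending = false;
	if (++s->reset_count > 1)
		return true;
	s->hw_resets++;
	return false;
}

/* Replays steps 1-5 and returns the number of hardware resets started. */
static int sim_run(bool cancel_on_reconfig)
{
	struct sim s = { 0 };
	bool blocked;

	sim_queue_work(&s);			/* step 1: first timeout */
	blocked = sim_reset_enter(&s);
	assert(!blocked);

	sim_queue_work(&s);			/* step 2: second timeout */
	blocked = sim_reset_enter(&s);
	assert(blocked);

	sim_queue_work(&s);			/* step 3: third timeout re-arms */

	/*
	 * Step 4: reconfig completes. With the proposed fix the stale
	 * pending bit is cancelled before completion is signalled.
	 */
	if (cancel_on_reconfig)
		sim_cancel_work(&s);
	s.reset_count--;			/* reconfig_complete decrement */
	s.reset_count--;			/* woken worker takes early exit */

	if (s.pending)				/* step 5: stale execution */
		sim_reset_enter(&s);

	return s.hw_resets;
}
```

Under these assumptions, sim_run(false) counts two hardware resets (the
second being the spurious one from step 5), while sim_run(true) counts
one.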


Thread overview:
2026-03-30 10:05 [RFC PATCH RESEND 0/3] net: ath11k: Firmware lockup detection & mitigation Matthew Leach
2026-03-30 10:05 ` [PATCH RESEND RFC 1/3] net: ath11k: fix redundant reset from stale pending workqueue bit Matthew Leach
2026-05-12 23:09   ` Jeff Johnson [this message]
2026-03-30 10:05 ` [PATCH RESEND RFC 2/3] net: ath11k: add firmware lockup detection and recovery Matthew Leach
2026-03-30 10:05 ` [PATCH RESEND RFC 3/3] net: ath11k: add lockup simulation via debugfs Matthew Leach
2026-05-12 23:19   ` Jeff Johnson
