From: Cole Leavitt <cole@unwrap.rs>
To: linux-wireless@vger.kernel.org
Cc: greearb@candelatech.com, miriam.rachel.korenblit@intel.com,
	johannes@sipsolutions.net, cole@unwrap.rs,
	stable@vger.kernel.org
Subject: [PATCH v3 1/3] wifi: iwlwifi: add STATUS_FW_ERROR guards to NAPI/TX-notif paths
Date: Mon, 20 Apr 2026 10:44:04 -0700
Message-ID: <20260420174406.128254-2-cole@unwrap.rs>
In-Reply-To: <20260420174406.128254-1-cole@unwrap.rs>

After a firmware error is detected and STATUS_FW_ERROR is set, NAPI may
still be in flight from a prior interrupt, or may be scheduled by the
MSIX IRQ handler before the error bit is processed.  The NAPI poll
functions have no STATUS_FW_ERROR check and will continue processing
stale RX ring entries from dying firmware.

iwl_trans_reclaim() already early-returns on STATUS_FW_ERROR, so any
TX-response notification that makes it through to reclaim is a no-op.
What remains is:

  * CPU time spent parsing stale RX entries inside
    iwl_pcie_rx_handle() before dispatching to the op_mode.
  * No signal in the logs when the race fires, making the
    post-FW-error sequence harder to debug.

Add STATUS_FW_ERROR early-returns with WARN_ONCE() in four places:

  * iwl_pcie_napi_poll()        (legacy NAPI poll)
  * iwl_pcie_napi_poll_msix()   (MSIX NAPI poll)
  * iwl_mld_handle_tx_resp_notif()
  * iwl_mld_handle_compressed_ba_notif()

Rationale:

  1. Stop NAPI from consuming any more RX budget once firmware is
     declared dead; the restart path will re-initialise the rings.
  2. Provide a single, one-shot log line via WARN_ONCE so we can tell
     from a user's dmesg whether the post-error race actually fired in
     their configuration.  The race has been hard to reproduce outside
     Ben Greear's test rig.

_iwl_trans_pcie_gen2_stop_device() already calls iwl_pcie_rx_napi_sync()
to quiesce NAPI during device teardown, but that runs much later in the
restart sequence; these checks close the window between error detection
and device stop.

Fixes: d1e879ec600f ("wifi: iwlwifi: add iwlmld sub-driver")
Cc: stable@vger.kernel.org
Tested-by: Ben Greear <greearb@candelatech.com>
Signed-off-by: Cole Leavitt <cole@unwrap.rs>
---
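For readers without the driver source at hand, the poll-side guard
described above can be modelled in user space roughly as follows.  This
is a sketch under stated assumptions, not driver code: every model_*
name and MODEL_* macro is a stand-in invented for illustration, the flag
is a plain bitmask rather than test_bit(), and the one-shot warning
prints to stderr without the backtrace that WARN_ONCE() emits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the STATUS_FW_ERROR bit. */
#define MODEL_STATUS_FW_ERROR (1UL << 0)

struct model_trans {
	unsigned long status;
};

struct model_napi {
	bool scheduled;
};

static int model_warn_count;     /* times the one-shot warning printed */
static int model_entries_parsed; /* RX entries the poll actually handled */

/* Print at most once per call site, loosely imitating WARN_ONCE(). */
#define MODEL_WARN_ONCE(fmt, ...)				\
	do {							\
		static bool warned;				\
		if (!warned) {					\
			warned = true;				\
			model_warn_count++;			\
			fprintf(stderr, fmt, __VA_ARGS__);	\
		}						\
	} while (0)

/* Stand-in for completing a NAPI instance. */
static void model_napi_complete(struct model_napi *napi)
{
	napi->scheduled = false;
}

/* Stand-in for RX handling: consume up to 'budget' pending entries. */
static int model_rx_handle(int pending, int budget)
{
	int n = pending < budget ? pending : budget;

	model_entries_parsed += n;
	return n;
}

static int model_napi_poll(struct model_trans *trans,
			   struct model_napi *napi,
			   int queue, int pending, int budget)
{
	int handled;

	/* Guard: once the error flag is set, consume no RX budget. */
	if (trans->status & MODEL_STATUS_FW_ERROR) {
		MODEL_WARN_ONCE("model: poll[%d] invoked after FW error\n",
				queue);
		model_napi_complete(napi);
		return 0;
	}

	handled = model_rx_handle(pending, budget);
	if (handled < budget)
		model_napi_complete(napi);
	return handled;
}

/* Runs one healthy poll and two post-error polls; returns warn count. */
int model_demo(void)
{
	struct model_trans trans = { .status = 0 };
	struct model_napi napi = { .scheduled = true };
	int ret;

	ret = model_napi_poll(&trans, &napi, 0, 8, 64);
	assert(ret == 8 && !napi.scheduled);

	trans.status |= MODEL_STATUS_FW_ERROR;
	napi.scheduled = true;
	ret = model_napi_poll(&trans, &napi, 0, 32, 64);
	assert(ret == 0 && !napi.scheduled);

	napi.scheduled = true;
	ret = model_napi_poll(&trans, &napi, 0, 32, 64);
	assert(ret == 0);

	return model_warn_count;
}
```

Calling model_demo() from a main() prints the warning line once on
stderr even though the guarded poll runs twice, and the parsed-entry
counter stays at the value reached before the error flag was set.
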
 drivers/net/wireless/intel/iwlwifi/mld/tx.c       | 19 +++++++++++++++++++
 drivers/net/wireless/intel/iwlwifi/pcie/gen1_2/rx.c | 18 ++++++++++++++++++
 2 files changed, 37 insertions(+)

diff --git a/drivers/net/wireless/intel/iwlwifi/mld/tx.c b/drivers/net/wireless/intel/iwlwifi/mld/tx.c
index 546d09a38dab..e341d12e5233 100644
--- a/drivers/net/wireless/intel/iwlwifi/mld/tx.c
+++ b/drivers/net/wireless/intel/iwlwifi/mld/tx.c
@@ -1082,6 +1082,15 @@ void iwl_mld_handle_tx_resp_notif(struct iwl_mld *mld,
 	bool mgmt = false;
 	bool tx_failure = (status & TX_STATUS_MSK) != TX_STATUS_SUCCESS;
 
+	/* iwl_trans_reclaim() already guards on STATUS_FW_ERROR, but
+	 * bail out earlier (and log once) so we can tell from dmesg
+	 * whether this race actually fires in the field.
+	 */
+	if (unlikely(test_bit(STATUS_FW_ERROR, &mld->trans->status))) {
+		WARN_ONCE(1, "iwlwifi: TX resp notif (sta=%d txq=%d) after FW error\n",
+			  sta_id, txq_id);
+		return;
+	}
 	if (IWL_FW_CHECK(mld, tx_resp->frame_count != 1,
 			 "Invalid tx_resp notif frame_count (%d)\n",
 			 tx_resp->frame_count))
@@ -1360,6 +1369,16 @@ void iwl_mld_handle_compressed_ba_notif(struct iwl_mld *mld,
 	u8 sta_id = ba_res->sta_id;
 	struct ieee80211_link_sta *link_sta;
 
+	/* Same rationale as iwl_mld_handle_tx_resp_notif: redundant with
+	 * iwl_trans_reclaim()'s own STATUS_FW_ERROR check, but fails fast
+	 * and logs via WARN_ONCE when the race is actually hit.
+	 */
+	if (unlikely(test_bit(STATUS_FW_ERROR, &mld->trans->status))) {
+		WARN_ONCE(1, "iwlwifi: BA notif (sta=%d) after FW error\n",
+			  sta_id);
+		return;
+	}
+
 	if (!tfd_cnt)
 		return;
 
diff --git a/drivers/net/wireless/intel/iwlwifi/pcie/gen1_2/rx.c b/drivers/net/wireless/intel/iwlwifi/pcie/gen1_2/rx.c
index fe263cdc2e4f..554c22777ec1 100644
--- a/drivers/net/wireless/intel/iwlwifi/pcie/gen1_2/rx.c
+++ b/drivers/net/wireless/intel/iwlwifi/pcie/gen1_2/rx.c
@@ -1012,6 +1012,15 @@ static int iwl_pcie_napi_poll(struct napi_struct *napi, int budget)
 	trans_pcie = iwl_netdev_to_trans_pcie(napi->dev);
 	trans = trans_pcie->trans;
 
+	/* Don't process RX for dying firmware; the restart path will
+	 * re-init the rings.  WARN_ONCE helps surface whether this race
+	 * actually fires in user dmesg.
+	 */
+	if (unlikely(test_bit(STATUS_FW_ERROR, &trans->status))) {
+		WARN_ONCE(1, "iwlwifi: NAPI poll[%d] invoked after FW error\n",
+			  rxq->id);
+		napi_complete_done(napi, 0);
+		return 0;
+	}
+
 	ret = iwl_pcie_rx_handle(trans, rxq->id, budget);
 
 	IWL_DEBUG_ISR(trans, "[%d] handled %d, budget %d\n",
@@ -1039,6 +1048,15 @@ static int iwl_pcie_napi_poll_msix(struct napi_struct *napi, int budget)
 	trans_pcie = iwl_netdev_to_trans_pcie(napi->dev);
 	trans = trans_pcie->trans;
 
+	if (unlikely(test_bit(STATUS_FW_ERROR, &trans->status))) {
+		WARN_ONCE(1,
+			  "iwlwifi: NAPI MSIX poll[%d] invoked after FW error\n",
+			  rxq->id);
+		napi_complete_done(napi, 0);
+		return 0;
+	}
+
 	ret = iwl_pcie_rx_handle(trans, rxq->id, budget);
 	IWL_DEBUG_ISR(trans, "[%d] handled %d, budget %d\n", rxq->id, ret,
 		      budget);
-- 
2.52.0

