From: Cole Leavitt <cole@unwrap.rs>
To: linux-wireless@vger.kernel.org
Cc: greearb@candelatech.com, miriam.rachel.korenblit@intel.com,
johannes@sipsolutions.net, cole@unwrap.rs,
stable@vger.kernel.org
Subject: [PATCH v3 1/3] wifi: iwlwifi: add STATUS_FW_ERROR guards to NAPI/TX-notif paths
Date: Mon, 20 Apr 2026 10:44:04 -0700
Message-ID: <20260420174406.128254-2-cole@unwrap.rs>
In-Reply-To: <20260420174406.128254-1-cole@unwrap.rs>

After a firmware error is detected and STATUS_FW_ERROR is set, a NAPI
poll may still be in flight from a prior interrupt, or may be scheduled
by the MSIX IRQ handler before the error bit is acted on. The NAPI poll
functions do not check STATUS_FW_ERROR and will continue processing
stale RX ring entries from the dying firmware.

iwl_trans_reclaim() already returns early on STATUS_FW_ERROR, so any
TX-response notification that reaches reclaim is a no-op. Two problems
remain:

  * CPU time is spent parsing stale RX entries inside
    iwl_pcie_rx_handle() before they are dispatched to the op_mode.

  * Nothing is logged when the race fires, which makes the
    post-FW-error sequence harder to debug.

Add STATUS_FW_ERROR early returns with WARN_ONCE() in four places:

  * iwl_pcie_napi_poll() (legacy NAPI poll)
  * iwl_pcie_napi_poll_msix() (MSIX NAPI poll)
  * iwl_mld_handle_tx_resp_notif()
  * iwl_mld_handle_compressed_ba_notif()

Rationale:

  1. Stop NAPI from consuming any more RX budget once the firmware is
     declared dead; the restart path will re-initialise the rings.

  2. Emit a single, one-shot log line via WARN_ONCE() so that a user's
     dmesg shows whether the post-error race actually fired in their
     configuration. The race has been hard to reproduce outside
     Ben Greear's test rig.

_iwl_trans_pcie_gen2_stop_device() already calls iwl_pcie_rx_napi_sync()
to quiesce NAPI during device teardown, but that runs much later in the
restart sequence; these checks close the window between error detection
and device stop.

Fixes: d1e879ec600f ("wifi: iwlwifi: add iwlmld sub-driver")
Cc: stable@vger.kernel.org
Tested-by: Ben Greear <greearb@candelatech.com>
Signed-off-by: Cole Leavitt <cole@unwrap.rs>
---
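For readers who want to see the guard pattern in isolation, here is a
minimal user-space sketch of the early return plus one-shot warning
described in the commit message. It is an illustration under stated
assumptions, not driver code: model_trans, model_warn_once(),
model_napi_poll() and MODEL_STATUS_FW_ERROR are hypothetical stand-ins
for trans->status, WARN_ONCE(), the NAPI poll callbacks and
STATUS_FW_ERROR, and the sketch does not model napi_complete_done() or
interrupt re-enabling.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdio.h>

/* Hypothetical stand-in for the STATUS_FW_ERROR bit. */
#define MODEL_STATUS_FW_ERROR 0x1u

/* Hypothetical stand-in for the transport: a status word plus a
 * counter of entries still waiting in the RX ring.
 */
struct model_trans {
	atomic_uint status;
	unsigned int rx_pending;
};

static atomic_flag warned_once = ATOMIC_FLAG_INIT;
static unsigned int warn_count;

/* Models WARN_ONCE(): only the first hit produces a log line. */
static void model_warn_once(const char *msg, int queue)
{
	if (!atomic_flag_test_and_set(&warned_once)) {
		warn_count++;
		fprintf(stderr, "%s[%d]\n", msg, queue);
	}
}

/* Models a poll callback: returns the number of entries handled,
 * which is never more than budget.
 */
static int model_napi_poll(struct model_trans *trans, int queue, int budget)
{
	int handled = 0;

	/* Guard: once the error bit is set, do not touch the ring. */
	if (atomic_load(&trans->status) & MODEL_STATUS_FW_ERROR) {
		model_warn_once("poll after FW error, queue", queue);
		return 0;
	}

	while (handled < budget && trans->rx_pending > 0) {
		trans->rx_pending--;
		handled++;
	}

	return handled;
}
```

With the error bit clear, model_napi_poll() drains up to budget
entries. Once the bit is set, every call returns 0 and leaves
rx_pending untouched, and only the first such call logs.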
drivers/net/wireless/intel/iwlwifi/mld/tx.c | 19 +++++++++++++++++++
drivers/net/wireless/intel/iwlwifi/pcie/gen1_2/rx.c | 18 ++++++++++++++++++
2 files changed, 37 insertions(+)
diff --git a/drivers/net/wireless/intel/iwlwifi/mld/tx.c b/drivers/net/wireless/intel/iwlwifi/mld/tx.c
index 546d09a38dab..e341d12e5233 100644
--- a/drivers/net/wireless/intel/iwlwifi/mld/tx.c
+++ b/drivers/net/wireless/intel/iwlwifi/mld/tx.c
@@ -1082,6 +1082,15 @@ void iwl_mld_handle_tx_resp_notif(struct iwl_mld *mld,
bool mgmt = false;
bool tx_failure = (status & TX_STATUS_MSK) != TX_STATUS_SUCCESS;
+ /* iwl_trans_reclaim() already guards on STATUS_FW_ERROR, but
+ * bail out earlier (and log once) so we can tell from dmesg
+ * whether this race actually fires in the field.
+ */
+ if (unlikely(test_bit(STATUS_FW_ERROR, &mld->trans->status))) {
+ WARN_ONCE(1, "iwlwifi: TX resp notif (sta=%d txq=%d) after FW error\n",
+ sta_id, txq_id);
+ return;
+ }
if (IWL_FW_CHECK(mld, tx_resp->frame_count != 1,
"Invalid tx_resp notif frame_count (%d)\n",
tx_resp->frame_count))
@@ -1360,6 +1369,16 @@ void iwl_mld_handle_compressed_ba_notif(struct iwl_mld *mld,
u8 sta_id = ba_res->sta_id;
struct ieee80211_link_sta *link_sta;
+ /* Same rationale as iwl_mld_handle_tx_resp_notif: redundant with
+ * iwl_trans_reclaim()'s own STATUS_FW_ERROR check, but fails fast
+ * and logs via WARN_ONCE when the race is actually hit.
+ */
+ if (unlikely(test_bit(STATUS_FW_ERROR, &mld->trans->status))) {
+ WARN_ONCE(1, "iwlwifi: BA notif (sta=%d) after FW error\n",
+ sta_id);
+ return;
+ }
+
if (!tfd_cnt)
return;
diff --git a/drivers/net/wireless/intel/iwlwifi/pcie/gen1_2/rx.c b/drivers/net/wireless/intel/iwlwifi/pcie/gen1_2/rx.c
index fe263cdc2e4f..554c22777ec1 100644
--- a/drivers/net/wireless/intel/iwlwifi/pcie/gen1_2/rx.c
+++ b/drivers/net/wireless/intel/iwlwifi/pcie/gen1_2/rx.c
@@ -1012,6 +1012,15 @@ static int iwl_pcie_napi_poll(struct napi_struct *napi, int budget)
trans_pcie = iwl_netdev_to_trans_pcie(napi->dev);
trans = trans_pcie->trans;
+ /* Don't process RX for dying firmware; the restart path will
+ * re-init the rings. WARN_ONCE helps surface whether this race
+ * actually fires in user dmesg.
+ */
+ if (unlikely(test_bit(STATUS_FW_ERROR, &trans->status))) {
+ WARN_ONCE(1, "iwlwifi: NAPI poll[%d] invoked after FW error\n",
+ rxq->id);
+ napi_complete_done(napi, 0);
+ return 0;
+ }
+
ret = iwl_pcie_rx_handle(trans, rxq->id, budget);
IWL_DEBUG_ISR(trans, "[%d] handled %d, budget %d\n",
@@ -1039,6 +1048,15 @@ static int iwl_pcie_napi_poll_msix(struct napi_struct *napi, int budget)
trans_pcie = iwl_netdev_to_trans_pcie(napi->dev);
trans = trans_pcie->trans;
+ if (unlikely(test_bit(STATUS_FW_ERROR, &trans->status))) {
+ WARN_ONCE(1,
+ "iwlwifi: NAPI MSIX poll[%d] invoked after FW error\n",
+ rxq->id);
+ napi_complete_done(napi, 0);
+ return 0;
+ }
+
ret = iwl_pcie_rx_handle(trans, rxq->id, budget);
IWL_DEBUG_ISR(trans, "[%d] handled %d, budget %d\n", rxq->id, ret,
budget);
--
2.52.0