From: Cole Leavitt <cole@unwrap.rs>
To: johannes.berg@intel.com, miriam.rachel.korenblit@intel.com
Cc: greearb@candelatech.com, linux-wireless@vger.kernel.org,
stable@vger.kernel.org, Cole Leavitt <cole@unwrap.rs>
Subject: [PATCH v3] wifi: iwlwifi: prevent NAPI processing after firmware error
Date: Sat, 14 Feb 2026 11:43:52 -0700 [thread overview]
Message-ID: <20260214184352.11512-1-cole@unwrap.rs> (raw)
In-Reply-To: <20260214181018.6091-1-cole@unwrap.rs>
After a firmware error is detected and STATUS_FW_ERROR is set, NAPI can
still be actively polling or get scheduled from a prior interrupt. The
NAPI poll functions (both legacy and MSIX variants) have no check for
STATUS_FW_ERROR and will continue processing stale RX ring entries from
dying firmware. This can dispatch TX completion notifications containing
corrupt SSN values to iwl_mld_handle_tx_resp_notif(), which passes them
to iwl_trans_reclaim(). If the corrupt SSN causes reclaim to walk TX
queue entries that were already freed by a prior correct reclaim, the
result is an skb use-after-free or double-free.
The race window opens when the MSIX IRQ handler schedules NAPI (lines
2319-2321 in rx.c) before processing the error bit (lines 2382-2396),
or when NAPI is already running on another CPU from a previous interrupt
when STATUS_FW_ERROR gets set on the current CPU.
Add STATUS_FW_ERROR checks to both NAPI poll functions to prevent
processing stale RX data after firmware error, and add early-return
guards in the TX response and compressed BA notification handlers as
defense-in-depth. Each check uses WARN_ONCE to log if the race is
actually hit, which aids diagnosis of the hard-to-reproduce skb
use-after-free reported on Intel BE200.
Note that _iwl_trans_pcie_gen2_stop_device() already calls
iwl_pcie_rx_napi_sync() to quiesce NAPI during device teardown, but that
runs much later in the restart sequence. These checks close the window
between error detection and device stop.
Fixes: d1e879ec600f ("wifi: iwlwifi: add iwlmld sub-driver")
Cc: stable@vger.kernel.org
Signed-off-by: Cole Leavitt <cole@unwrap.rs>
---
Changes since v1:
- Added Fixes: tag and Cc: stable@vger.kernel.org
Tested on Intel BE200 (FW 101.6e695a70.0) by forcing NMI via debugfs.
The WARN_ONCE fires reliably:
iwlwifi: NAPI MSIX poll[0] invoked after FW error
WARNING: drivers/net/wireless/intel/iwlwifi/pcie/gen1_2/rx.c:1058
at iwl_pcie_napi_poll_msix+0xff/0x130 [iwlwifi], CPU#22
Confirming NAPI poll is invoked after STATUS_FW_ERROR is set. Without
this patch, that poll processes stale RX ring data from dead firmware.
drivers/net/wireless/intel/iwlwifi/mld/tx.c | 19 ++++++++++++++++++
.../wireless/intel/iwlwifi/pcie/gen1_2/rx.c | 20 +++++++++++++++++++
2 files changed, 39 insertions(+)
diff --git a/drivers/net/wireless/intel/iwlwifi/mld/tx.c b/drivers/net/wireless/intel/iwlwifi/mld/tx.c
index 3b4b575aadaa..3e99f3ded9bc 100644
--- a/drivers/net/wireless/intel/iwlwifi/mld/tx.c
+++ b/drivers/net/wireless/intel/iwlwifi/mld/tx.c
@@ -1071,6 +1071,18 @@ void iwl_mld_handle_tx_resp_notif(struct iwl_mld *mld,
bool mgmt = false;
bool tx_failure = (status & TX_STATUS_MSK) != TX_STATUS_SUCCESS;
+ /* Firmware is dead — the TX response may contain corrupt SSN values
+ * from a dying firmware DMA. Processing it could cause
+ * iwl_trans_reclaim() to free the wrong TX queue entries, leading to
+ * skb use-after-free or double-free.
+ */
+ if (unlikely(test_bit(STATUS_FW_ERROR, &mld->trans->status))) {
+ WARN_ONCE(1,
+ "iwlwifi: TX resp notif (sta=%d txq=%d) after FW error\n",
+ sta_id, txq_id);
+ return;
+ }
+
if (IWL_FW_CHECK(mld, tx_resp->frame_count != 1,
"Invalid tx_resp notif frame_count (%d)\n",
tx_resp->frame_count))
@@ -1349,6 +1361,13 @@ void iwl_mld_handle_compressed_ba_notif(struct iwl_mld *mld,
u8 sta_id = ba_res->sta_id;
struct ieee80211_link_sta *link_sta;
+ if (unlikely(test_bit(STATUS_FW_ERROR, &mld->trans->status))) {
+ WARN_ONCE(1,
+ "iwlwifi: BA notif (sta=%d) after FW error\n",
+ sta_id);
+ return;
+ }
+
if (!tfd_cnt)
return;
diff --git a/drivers/net/wireless/intel/iwlwifi/pcie/gen1_2/rx.c b/drivers/net/wireless/intel/iwlwifi/pcie/gen1_2/rx.c
index 619a9505e6d9..ba18d35fa55d 100644
--- a/drivers/net/wireless/intel/iwlwifi/pcie/gen1_2/rx.c
+++ b/drivers/net/wireless/intel/iwlwifi/pcie/gen1_2/rx.c
@@ -1015,6 +1015,18 @@ static int iwl_pcie_napi_poll(struct napi_struct *napi, int budget)
trans_pcie = iwl_netdev_to_trans_pcie(napi->dev);
trans = trans_pcie->trans;
+ /* Stop processing RX if firmware has crashed. Stale notifications
+ * from dying firmware (e.g. TX completions with corrupt SSN values)
+ * can cause use-after-free in reclaim paths.
+ */
+ if (unlikely(test_bit(STATUS_FW_ERROR, &trans->status))) {
+ WARN_ONCE(1,
+ "iwlwifi: NAPI poll[%d] invoked after FW error\n",
+ rxq->id);
+ napi_complete_done(napi, 0);
+ return 0;
+ }
+
ret = iwl_pcie_rx_handle(trans, rxq->id, budget);
IWL_DEBUG_ISR(trans, "[%d] handled %d, budget %d\n",
@@ -1042,6 +1054,14 @@ static int iwl_pcie_napi_poll_msix(struct napi_struct *napi, int budget)
trans_pcie = iwl_netdev_to_trans_pcie(napi->dev);
trans = trans_pcie->trans;
+ if (unlikely(test_bit(STATUS_FW_ERROR, &trans->status))) {
+ WARN_ONCE(1,
+ "iwlwifi: NAPI MSIX poll[%d] invoked after FW error\n",
+ rxq->id);
+ napi_complete_done(napi, 0);
+ return 0;
+ }
+
ret = iwl_pcie_rx_handle(trans, rxq->id, budget);
IWL_DEBUG_ISR(trans, "[%d] handled %d, budget %d\n", rxq->id, ret,
budget);
--
2.52.0
next prev parent reply other threads:[~2026-02-14 18:45 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <c6f886d4-b9ed-48a6-9723-a738af055b64@candelatech.com>
2026-02-14 18:10 ` [PATCH] wifi: iwlwifi: prevent NAPI processing after firmware error Cole Leavitt
[not found] ` <5be8a502-d53a-4cce-821f-202368c44f6d@candelatech.com>
2026-02-14 18:33 ` Cole Leavitt
2026-02-16 18:12 ` Ben Greear
2026-02-18 14:44 ` Cole Leavitt
2026-02-18 14:44 ` Cole Leavitt
2026-02-18 14:47 ` [PATCH 0/1] wifi: iwlwifi: mld: fix TSO segmentation explosion causing UAF Cole Leavitt
2026-02-18 14:47 ` [PATCH 1/1] wifi: iwlwifi: mld: fix TSO segmentation explosion when AMSDU is disabled Cole Leavitt
2026-03-22 12:28 ` Korenblit, Miriam Rachel
2026-03-22 12:29 ` [PATCH 0/1] wifi: iwlwifi: mld: fix TSO segmentation explosion causing UAF Korenblit, Miriam Rachel
2026-02-18 17:35 ` [PATCH] wifi: iwlwifi: prevent NAPI processing after firmware error Ben Greear
2026-02-14 18:41 ` Cole Leavitt
2026-02-14 18:43 ` Cole Leavitt [this message]
2026-02-26 19:37 ` [PATCH v3] " Ben Greear
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260214184352.11512-1-cole@unwrap.rs \
--to=cole@unwrap.rs \
--cc=greearb@candelatech.com \
--cc=johannes.berg@intel.com \
--cc=linux-wireless@vger.kernel.org \
--cc=miriam.rachel.korenblit@intel.com \
--cc=stable@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox