public inbox for linux-wireless@vger.kernel.org
From: Cole Leavitt <cole@unwrap.rs>
To: johannes.berg@intel.com, miriam.rachel.korenblit@intel.com
Cc: greearb@candelatech.com, linux-wireless@vger.kernel.org,
	stable@vger.kernel.org, Cole Leavitt <cole@unwrap.rs>
Subject: [PATCH v3] wifi: iwlwifi: prevent NAPI processing after firmware error
Date: Sat, 14 Feb 2026 11:43:52 -0700
Message-ID: <20260214184352.11512-1-cole@unwrap.rs>
In-Reply-To: <20260214181018.6091-1-cole@unwrap.rs>

After a firmware error is detected and STATUS_FW_ERROR is set, NAPI can
still be actively polling or get scheduled from a prior interrupt. The
NAPI poll functions (both legacy and MSIX variants) have no check for
STATUS_FW_ERROR and will continue processing stale RX ring entries from
dying firmware. This can dispatch TX completion notifications containing
corrupt SSN values to iwl_mld_handle_tx_resp_notif(), which passes them
to iwl_trans_reclaim(). If the corrupt SSN causes reclaim to walk TX
queue entries that were already freed by a prior correct reclaim, the
result is an skb use-after-free or double-free.

The race window opens when the MSIX IRQ handler schedules NAPI (lines
2319-2321 in rx.c) before processing the error bit (lines 2382-2396),
or when NAPI is already running on another CPU from a previous interrupt
when STATUS_FW_ERROR gets set on the current CPU.

Add STATUS_FW_ERROR checks to both NAPI poll functions to prevent
processing stale RX data after firmware error, and add early-return
guards in the TX response and compressed BA notification handlers as
defense-in-depth. Each check uses WARN_ONCE to log if the race is
actually hit, which aids diagnosis of the hard-to-reproduce skb
use-after-free reported on Intel BE200.

Note that _iwl_trans_pcie_gen2_stop_device() already calls
iwl_pcie_rx_napi_sync() to quiesce NAPI during device teardown, but that
runs much later in the restart sequence. These checks close the window
between error detection and device stop.

Fixes: d1e879ec600f ("wifi: iwlwifi: add iwlmld sub-driver")
Cc: stable@vger.kernel.org
Signed-off-by: Cole Leavitt <cole@unwrap.rs>
---
Changes since v1:
  - Added Fixes: tag and Cc: stable@vger.kernel.org

Tested on Intel BE200 (FW 101.6e695a70.0) by forcing NMI via debugfs.
The WARN_ONCE fires reliably:

  iwlwifi: NAPI MSIX poll[0] invoked after FW error
  WARNING: drivers/net/wireless/intel/iwlwifi/pcie/gen1_2/rx.c:1058
           at iwl_pcie_napi_poll_msix+0xff/0x130 [iwlwifi], CPU#22

This confirms that NAPI poll is invoked after STATUS_FW_ERROR is set.
Without this patch, that poll would process stale RX ring data from the
dead firmware.
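
For readers who want the shape of the poll-side guard without the
driver context, here is a minimal userspace sketch. It is a model only:
struct model_trans, model_init(), model_fw_error() and
model_napi_poll() are hypothetical names invented for illustration, the
atomic flag merely stands in for STATUS_FW_ERROR, and nothing here is
iwlwifi or NAPI API.

```c
/* Userspace model only; none of these names are iwlwifi symbols. */
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define MODEL_RING_SIZE 8

struct model_trans {
	atomic_bool fw_error;	/* stands in for STATUS_FW_ERROR */
	int read_idx;		/* next ring entry the poll consumes */
	int write_idx;		/* one past the last entry "firmware" wrote */
	int handled;		/* entries consumed so far */
	int skipped_polls;	/* polls rejected by the guard */
};

/* Start with 'pending' unread entries and no firmware error. */
static void model_init(struct model_trans *t, int pending)
{
	assert(pending >= 0 && pending < MODEL_RING_SIZE);
	atomic_init(&t->fw_error, false);
	t->read_idx = 0;
	t->write_idx = pending;
	t->handled = 0;
	t->skipped_polls = 0;
}

/* Models the error path marking the firmware as dead. */
static void model_fw_error(struct model_trans *t)
{
	atomic_store(&t->fw_error, true);
}

/* Returns the number of entries processed, never more than budget. */
static int model_napi_poll(struct model_trans *t, int budget)
{
	int done = 0;

	assert(budget > 0);

	/* The guard: once the flag is visible, leave the ring untouched. */
	if (atomic_load(&t->fw_error)) {
		t->skipped_polls++;
		return 0;
	}

	while (done < budget && t->read_idx != t->write_idx) {
		t->read_idx = (t->read_idx + 1) % MODEL_RING_SIZE;
		t->handled++;
		done++;
	}
	return done;
}
```

With five pending entries, a poll with budget 3 consumes three of them.
After model_fw_error() the next poll returns 0 and the two remaining
entries are never read, which is the behaviour the guard is meant to
give for stale ring entries. The model checks the flag only on entry to
the poll, so it says nothing about a poll that is already mid-loop when
the flag is set; the handler-side checks in this patch are aimed at
that case.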

 drivers/net/wireless/intel/iwlwifi/mld/tx.c   | 19 ++++++++++++++++++
 .../wireless/intel/iwlwifi/pcie/gen1_2/rx.c   | 20 +++++++++++++++++++
 2 files changed, 39 insertions(+)

diff --git a/drivers/net/wireless/intel/iwlwifi/mld/tx.c b/drivers/net/wireless/intel/iwlwifi/mld/tx.c
index 3b4b575aadaa..3e99f3ded9bc 100644
--- a/drivers/net/wireless/intel/iwlwifi/mld/tx.c
+++ b/drivers/net/wireless/intel/iwlwifi/mld/tx.c
@@ -1071,6 +1071,18 @@ void iwl_mld_handle_tx_resp_notif(struct iwl_mld *mld,
 	bool mgmt = false;
 	bool tx_failure = (status & TX_STATUS_MSK) != TX_STATUS_SUCCESS;
 
+	/* Firmware is dead; the TX response may contain corrupt SSN values
+	 * DMA'd by the dying firmware. Processing it could cause
+	 * iwl_trans_reclaim() to free the wrong TX queue entries, leading to
+	 * skb use-after-free or double-free.
+	 */
+	if (unlikely(test_bit(STATUS_FW_ERROR, &mld->trans->status))) {
+		WARN_ONCE(1,
+			  "iwlwifi: TX resp notif (sta=%d txq=%d) after FW error\n",
+			  sta_id, txq_id);
+		return;
+	}
+
 	if (IWL_FW_CHECK(mld, tx_resp->frame_count != 1,
 			 "Invalid tx_resp notif frame_count (%d)\n",
 			 tx_resp->frame_count))
@@ -1349,6 +1361,13 @@ void iwl_mld_handle_compressed_ba_notif(struct iwl_mld *mld,
 	u8 sta_id = ba_res->sta_id;
 	struct ieee80211_link_sta *link_sta;
 
+	if (unlikely(test_bit(STATUS_FW_ERROR, &mld->trans->status))) {
+		WARN_ONCE(1,
+			  "iwlwifi: BA notif (sta=%d) after FW error\n",
+			  sta_id);
+		return;
+	}
+
 	if (!tfd_cnt)
 		return;
 
diff --git a/drivers/net/wireless/intel/iwlwifi/pcie/gen1_2/rx.c b/drivers/net/wireless/intel/iwlwifi/pcie/gen1_2/rx.c
index 619a9505e6d9..ba18d35fa55d 100644
--- a/drivers/net/wireless/intel/iwlwifi/pcie/gen1_2/rx.c
+++ b/drivers/net/wireless/intel/iwlwifi/pcie/gen1_2/rx.c
@@ -1015,6 +1015,18 @@ static int iwl_pcie_napi_poll(struct napi_struct *napi, int budget)
 	trans_pcie = iwl_netdev_to_trans_pcie(napi->dev);
 	trans = trans_pcie->trans;
 
+	/* Stop processing RX if firmware has crashed. Stale notifications
+	 * from dying firmware (e.g. TX completions with corrupt SSN values)
+	 * can cause use-after-free in reclaim paths.
+	 */
+	if (unlikely(test_bit(STATUS_FW_ERROR, &trans->status))) {
+		WARN_ONCE(1,
+			  "iwlwifi: NAPI poll[%d] invoked after FW error\n",
+			  rxq->id);
+		napi_complete_done(napi, 0);
+		return 0;
+	}
+
 	ret = iwl_pcie_rx_handle(trans, rxq->id, budget);
 
 	IWL_DEBUG_ISR(trans, "[%d] handled %d, budget %d\n",
@@ -1042,6 +1054,14 @@ static int iwl_pcie_napi_poll_msix(struct napi_struct *napi, int budget)
 	trans_pcie = iwl_netdev_to_trans_pcie(napi->dev);
 	trans = trans_pcie->trans;
 
+	if (unlikely(test_bit(STATUS_FW_ERROR, &trans->status))) {
+		WARN_ONCE(1,
+			  "iwlwifi: NAPI MSIX poll[%d] invoked after FW error\n",
+			  rxq->id);
+		napi_complete_done(napi, 0);
+		return 0;
+	}
+
 	ret = iwl_pcie_rx_handle(trans, rxq->id, budget);
 	IWL_DEBUG_ISR(trans, "[%d] handled %d, budget %d\n", rxq->id, ret,
 		      budget);
-- 
2.52.0

