From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from mail.unwrap.rs (mail.unwrap.rs [172.232.15.166])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id 7081542A9D;
	Sat, 14 Feb 2026 18:43:11 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=172.232.15.166
ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
	t=1771094592; cv=none;
	b=i8QIH+cZ12AJ+IezoEya4yn0DhUXAxqFb3scMqhFOEWGGk648L6piCeMe+GHkQ13AycaRNJ6z4zA1kCHxsRStNxc6n+Q46cMHZg0bJztRnv3h/E2ksCchswIp7mEmKS7FBbBypTUSe2CQgefZ1DzIfljhFqAiRioqVAXWc0CQ0c=
ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org;
	s=arc-20240116; t=1771094592; c=relaxed/simple;
	bh=tlnb9P5Y13z0f4mJObtljP+6s3KeyEPGnGQYpv+oVm8=;
	h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References:
	 MIME-Version:Content-Type;
	b=JAfuhhw91nxZ4fotVDg6QRrybw0hiCOHg54QqSlAiIxT4AC0vThdC7+Xecd01MKIrPbd/Dt9KKDOUJX6oj+VdPMWTgCjV5+SNC6+6SKZ7DXLwUVOgSKy11Hh1lJzEP1hDzB7o4WX/mjWrYOpvvhVP2D9v6R1K6UO/IVLViUTyOk=
ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=unwrap.rs; spf=pass smtp.mailfrom=unwrap.rs; arc=none smtp.client-ip=172.232.15.166
Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=unwrap.rs
Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=unwrap.rs
From: Cole Leavitt
To: johannes.berg@intel.com, miriam.rachel.korenblit@intel.com
Cc: greearb@candelatech.com, linux-wireless@vger.kernel.org, stable@vger.kernel.org, Cole Leavitt
Subject: [PATCH] wifi: iwlwifi: prevent NAPI processing after firmware error
Date: Sat, 14 Feb 2026 11:41:16 -0700
Message-ID: <20260214184116.11250-1-cole@unwrap.rs>
X-Mailer: git-send-email 2.52.0
In-Reply-To: <20260214181018.6091-1-cole@unwrap.rs>
References: <20260214181018.6091-1-cole@unwrap.rs>
Precedence: bulk
X-Mailing-List: linux-wireless@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

After a firmware error is detected and STATUS_FW_ERROR is set, NAPI can
still be actively polling or get scheduled from a prior interrupt. The
NAPI poll functions (both legacy and MSIX variants) have no check for
STATUS_FW_ERROR and will continue processing stale RX ring entries from
dying firmware.

This can dispatch TX completion notifications containing corrupt SSN
values to iwl_mld_handle_tx_resp_notif(), which passes them to
iwl_trans_reclaim(). If the corrupt SSN causes reclaim to walk TX queue
entries that were already freed by a prior correct reclaim, the result
is an skb use-after-free or double-free.

The race window opens when the MSIX IRQ handler schedules NAPI (lines
2319-2321 in rx.c) before processing the error bit (lines 2382-2396),
or when NAPI is already running on another CPU from a previous
interrupt when STATUS_FW_ERROR gets set on the current CPU.

Add STATUS_FW_ERROR checks to both NAPI poll functions to prevent
processing stale RX data after firmware error, and add early-return
guards in the TX response and compressed BA notification handlers as
defense-in-depth. Each check uses WARN_ONCE to log if the race is
actually hit, which aids diagnosis of the hard-to-reproduce skb
use-after-free reported on Intel BE200.

Note that _iwl_trans_pcie_gen2_stop_device() already calls
iwl_pcie_rx_napi_sync() to quiesce NAPI during device teardown, but that
runs much later in the restart sequence. These checks close the window
between error detection and device stop.
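
For illustration only, the guard described above can be modelled in
userspace roughly as follows. This is a minimal sketch, not driver code:
struct model_trans, MODEL_STATUS_FW_ERROR, model_set_fw_error() and
model_napi_poll() are hypothetical names that do not exist in iwlwifi,
and a C11 atomic flag word stands in for test_bit() on trans->status.
The sketch assumes the error bit is only ever set, never cleared, while
polling is possible.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names exist in iwlwifi. */
enum { MODEL_STATUS_FW_ERROR = 1u << 0 };

struct model_trans {
	atomic_uint status;	/* stands in for trans->status */
	int pending;		/* RX entries still sitting in the ring */
	int processed;		/* entries handed to the upper layers */
};

/* Rough equivalent of test_bit(STATUS_FW_ERROR, &trans->status). */
static bool model_fw_error(struct model_trans *t)
{
	return (atomic_load(&t->status) & MODEL_STATUS_FW_ERROR) != 0;
}

/* Called by the (modelled) error path once the firmware is known dead. */
static void model_set_fw_error(struct model_trans *t)
{
	atomic_fetch_or(&t->status, MODEL_STATUS_FW_ERROR);
}

/*
 * Modelled poll function: returns the number of entries processed.
 * The error bit is checked once on entry, so a poll that is invoked
 * after the bit is set consumes nothing and reports zero work.
 */
static int model_napi_poll(struct model_trans *t, int budget)
{
	int done = 0;

	assert(budget >= 0);

	if (model_fw_error(t))
		return 0;	/* leave stale entries untouched */

	while (done < budget && t->pending > 0) {
		t->pending--;
		t->processed++;
		done++;
	}

	return done;
}
```

With five pending entries, a call with budget 2 processes two of them;
after model_set_fw_error() a further call returns 0 and leaves the
remaining three entries in place. As in the patch, the check sits at
poll entry only, so it does not interrupt a poll that is already past
the check when the bit gets set.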

Fixes: d1e879ec600f ("wifi: iwlwifi: add iwlmld sub-driver")
Cc: stable@vger.kernel.org
Signed-off-by: Cole Leavitt
---
 drivers/net/wireless/intel/iwlwifi/mld/tx.c | 19 ++++++++++++++++++
 .../wireless/intel/iwlwifi/pcie/gen1_2/rx.c | 20 +++++++++++++++++++
 2 files changed, 39 insertions(+)

diff --git a/drivers/net/wireless/intel/iwlwifi/mld/tx.c b/drivers/net/wireless/intel/iwlwifi/mld/tx.c
index 3b4b575aadaa..3e99f3ded9bc 100644
--- a/drivers/net/wireless/intel/iwlwifi/mld/tx.c
+++ b/drivers/net/wireless/intel/iwlwifi/mld/tx.c
@@ -1071,6 +1071,18 @@ void iwl_mld_handle_tx_resp_notif(struct iwl_mld *mld,
 	bool mgmt = false;
 	bool tx_failure = (status & TX_STATUS_MSK) != TX_STATUS_SUCCESS;
 
+	/* Firmware is dead — the TX response may contain corrupt SSN values
+	 * from a dying firmware DMA. Processing it could cause
+	 * iwl_trans_reclaim() to free the wrong TX queue entries, leading to
+	 * skb use-after-free or double-free.
+	 */
+	if (unlikely(test_bit(STATUS_FW_ERROR, &mld->trans->status))) {
+		WARN_ONCE(1,
+			  "iwlwifi: TX resp notif (sta=%d txq=%d) after FW error\n",
+			  sta_id, txq_id);
+		return;
+	}
+
 	if (IWL_FW_CHECK(mld, tx_resp->frame_count != 1,
 			 "Invalid tx_resp notif frame_count (%d)\n",
 			 tx_resp->frame_count))
@@ -1349,6 +1361,13 @@ void iwl_mld_handle_compressed_ba_notif(struct iwl_mld *mld,
 	u8 sta_id = ba_res->sta_id;
 	struct ieee80211_link_sta *link_sta;
 
+	if (unlikely(test_bit(STATUS_FW_ERROR, &mld->trans->status))) {
+		WARN_ONCE(1,
+			  "iwlwifi: BA notif (sta=%d) after FW error\n",
+			  sta_id);
+		return;
+	}
+
 	if (!tfd_cnt)
 		return;
 
diff --git a/drivers/net/wireless/intel/iwlwifi/pcie/gen1_2/rx.c b/drivers/net/wireless/intel/iwlwifi/pcie/gen1_2/rx.c
index 619a9505e6d9..ba18d35fa55d 100644
--- a/drivers/net/wireless/intel/iwlwifi/pcie/gen1_2/rx.c
+++ b/drivers/net/wireless/intel/iwlwifi/pcie/gen1_2/rx.c
@@ -1015,6 +1015,18 @@ static int iwl_pcie_napi_poll(struct napi_struct *napi, int budget)
 	trans_pcie = iwl_netdev_to_trans_pcie(napi->dev);
 	trans = trans_pcie->trans;
 
+	/* Stop processing RX if firmware has crashed. Stale notifications
+	 * from dying firmware (e.g. TX completions with corrupt SSN values)
+	 * can cause use-after-free in reclaim paths.
+	 */
+	if (unlikely(test_bit(STATUS_FW_ERROR, &trans->status))) {
+		WARN_ONCE(1,
+			  "iwlwifi: NAPI poll[%d] invoked after FW error\n",
+			  rxq->id);
+		napi_complete_done(napi, 0);
+		return 0;
+	}
+
 	ret = iwl_pcie_rx_handle(trans, rxq->id, budget);
 
 	IWL_DEBUG_ISR(trans, "[%d] handled %d, budget %d\n",
@@ -1042,6 +1054,14 @@ static int iwl_pcie_napi_poll_msix(struct napi_struct *napi, int budget)
 	trans_pcie = iwl_netdev_to_trans_pcie(napi->dev);
 	trans = trans_pcie->trans;
 
+	if (unlikely(test_bit(STATUS_FW_ERROR, &trans->status))) {
+		WARN_ONCE(1,
+			  "iwlwifi: NAPI MSIX poll[%d] invoked after FW error\n",
+			  rxq->id);
+		napi_complete_done(napi, 0);
+		return 0;
+	}
+
 	ret = iwl_pcie_rx_handle(trans, rxq->id, budget);
 	IWL_DEBUG_ISR(trans, "[%d] handled %d, budget %d\n", rxq->id, ret,
 		      budget);
-- 
2.52.0