From: Cole Leavitt <cole@unwrap.rs>
To: greearb@candelatech.com
Cc: johannes.berg@intel.com, miriam.rachel.korenblit@intel.com,
linux-wireless@vger.kernel.org, Cole Leavitt <cole@unwrap.rs>
Subject: Re: [PATCH] wifi: iwlwifi: prevent NAPI processing after firmware error
Date: Sat, 14 Feb 2026 11:33:06 -0700
Message-ID: <20260214183306.10188-1-cole@unwrap.rs>
In-Reply-To: <5be8a502-d53a-4cce-821f-202368c44f6d@candelatech.com>
Ben,
Good catch on both fronts.
On the build_tfd dangling pointer -- you're right. The failure path at
line 775 leaves entries[idx].skb/cmd pointing at caller-owned objects
(set at lines 763-764). The caller gets -1 and presumably frees the
skb, so entries[idx].skb becomes a dangling pointer. While write_ptr
not advancing means current unmap paths won't iterate to that index,
it's a latent UAF waiting for a flush path change or future code to
touch it. Two NULL stores inside a held spinlock cost nothing. I think
this should go upstream as its own patch.
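
A minimal user-space sketch of the failure-path cleanup described above. The struct and function names here (tx_entry, tx_ring, ring_submit) and the build_ok parameter are illustrative stand-ins, not the driver's actual code, and locking is not modeled:

```c
#include <stddef.h>

#define RING_SIZE 8

/* Hypothetical stand-in for the per-entry bookkeeping. */
struct tx_entry {
	void *skb;	/* caller-owned packet buffer */
	void *cmd;	/* caller-owned device command */
};

struct tx_ring {
	struct tx_entry entries[RING_SIZE];
	int write_ptr;
};

/*
 * Model of a TX submit. build_ok stands in for the result of the
 * descriptor build step. Returns 0 on success, -1 on failure.
 */
static int ring_submit(struct tx_ring *ring, void *skb, void *cmd,
		       int build_ok)
{
	int idx = ring->write_ptr % RING_SIZE;

	/* Bookkeeping is recorded before the build is attempted. */
	ring->entries[idx].skb = skb;
	ring->entries[idx].cmd = cmd;

	if (!build_ok) {
		/*
		 * The caller still owns skb/cmd and will free them, so
		 * clear the slot rather than leave stale pointers behind.
		 */
		ring->entries[idx].skb = NULL;
		ring->entries[idx].cmd = NULL;
		return -1;
	}

	ring->write_ptr = (ring->write_ptr + 1) % RING_SIZE;
	return 0;
}
```

The point of the sketch is only that the two stores sit on the error path, after which nothing that later walks the ring can reach the freed objects through that slot.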
On the TOCTOU question -- this is the part I spent the most time on.
The window you're asking about is: firmware starts producing corrupt
completion data *before* STATUS_FW_ERROR gets set. Our NAPI/TX handler
checks can't help there because the flag isn't set yet.
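
To make the window concrete, here is a rough user-space model of an early-out check in a poll handler. model_trans, model_poll and the fw_error flag are hypothetical names (the flag stands in for the STATUS_FW_ERROR bit); this is a sketch of the idea, not the patch itself:

```c
#include <stdatomic.h>
#include <stdbool.h>

struct model_trans {
	atomic_bool fw_error;	/* stands in for STATUS_FW_ERROR */
	int pending;		/* completions waiting to be processed */
};

/* Process up to budget completions unless the error flag is set. */
static int model_poll(struct model_trans *t, int budget)
{
	int done = 0;

	/*
	 * The check only helps once the flag is visible. Completions
	 * that were corrupted before it was set still get processed.
	 */
	if (atomic_load(&t->fw_error))
		return 0;

	while (done < budget && t->pending > 0) {
		t->pending--;
		done++;
	}
	return done;
}
```

Anything the model processes while fw_error is still false is the window in question.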
The primary guard in that window is iwl_txq_used() in
iwl_pcie_reclaim(). It validates that the firmware's SSN falls within
[read_ptr, write_ptr). This catches wild values -- out-of-range SSNs,
wraparound corruption, etc.
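
For reference, a simplified user-space model of a half-open ring window check with wraparound. txq_index_in_use and QUEUE_SIZE are illustrative names and an assumed power-of-two queue size; this is not a copy of iwl_txq_used():

```c
#include <stdbool.h>

#define QUEUE_SIZE 256	/* assumed power of two */

/* True if idx falls inside the in-use window [read_ptr, write_ptr). */
static bool txq_index_in_use(int idx, int read_ptr, int write_ptr)
{
	int i = idx & (QUEUE_SIZE - 1);
	int r = read_ptr & (QUEUE_SIZE - 1);
	int w = write_ptr & (QUEUE_SIZE - 1);

	if (w >= r)
		return i >= r && i < w;

	/* Wrapped ring: in use means outside the free gap [w, r). */
	return !(i >= w && i < r);
}
```

With read_ptr = 5 and write_ptr = 20, index 25 is rejected, while every index from 5 through 19 is accepted regardless of whether the firmware actually completed it.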
What it can't catch is an in-range corrupt SSN -- e.g., firmware says
reclaim up to index 15 when the legitimate value is 8, but write_ptr
is 20.
That passes bounds checking and the reclaim loop frees skbs for
entries still in-flight (active DMA). The NULL skb WARN_ONCE in the
loop catches double-reclaim but not first-time over-reclaim.
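
A toy model of that reclaim loop, using the numbers from the example. The names (entry, txq, reclaim_upto) are hypothetical, "freeing" is modeled by clearing the pointer, and ssn is assumed to already be a valid ring index:

```c
#include <stddef.h>

#define QSIZE 32

struct entry {
	void *skb;
};

struct txq {
	struct entry entries[QSIZE];
	int read_ptr;
	int write_ptr;
};

/*
 * Release entries from read_ptr up to, but not including, ssn.
 * Returns the number of entries released. *null_hits counts slots
 * that were already empty, which is where a warning would fire.
 * ssn must be in [0, QSIZE) or the loop will not terminate.
 */
static int reclaim_upto(struct txq *q, int ssn, int *null_hits)
{
	int freed = 0;

	*null_hits = 0;
	while (q->read_ptr != ssn) {
		struct entry *e = &q->entries[q->read_ptr];

		if (!e->skb) {
			(*null_hits)++;
		} else {
			e->skb = NULL;	/* models freeing the skb */
			freed++;
		}
		q->read_ptr = (q->read_ptr + 1) % QSIZE;
	}
	return freed;
}
```

With 20 entries queued and a bogus ssn of 15, the first pass releases 15 entries with zero NULL hits, so the 7 entries past the legitimate index 8 are freed silently. Only a second pass over the same range would trip the NULL check.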
The complete fix for this would be a per-entry generation counter --
tag each entry on submit, validate on reclaim. But that adds per-entry
overhead on the TX hot path to protect against a condition (firmware
producing corrupt completions) that is already terminal. I think the
right trade-off is:
1. Your build_tfd NULL fix (eliminates one dangling pointer class)
2. STATUS_FW_ERROR checks in NAPI poll + TX handlers (this series --
   shrinks the detection window to near-zero)
3. The existing iwl_txq_used() bounds check (catches most corrupt
   SSNs)
Together these make the damage window small enough that a per-entry
generation scheme isn't justified -- by the time firmware is sending
corrupt SSNs, we're in dump-and-reset territory anyway.
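
For completeness, one way the per-entry generation idea could look. This is purely a sketch: gen_entry, gen_txq, gen_submit and gen_reclaim_ok are hypothetical names, and it assumes the completion path can present the tag that was issued at submit time, which is not something settled in this thread:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define QSIZE 32

struct gen_entry {
	void *skb;
	uint16_t gen;	/* tag assigned at submit time */
};

struct gen_txq {
	struct gen_entry entries[QSIZE];
	uint16_t next_gen;
	int write_ptr;
};

/* Tag the entry on submit and return the tag that was issued. */
static uint16_t gen_submit(struct gen_txq *q, void *skb)
{
	struct gen_entry *e = &q->entries[q->write_ptr];
	uint16_t gen = ++q->next_gen;

	e->skb = skb;
	e->gen = gen;
	q->write_ptr = (q->write_ptr + 1) % QSIZE;
	return gen;
}

/* Validate on reclaim: slot must be populated and the tag must match. */
static bool gen_reclaim_ok(const struct gen_txq *q, int idx,
			   uint16_t completed_gen)
{
	const struct gen_entry *e = &q->entries[idx];

	return e->skb != NULL && e->gen == completed_gen;
}
```

Even in this reduced form it adds a store on every submit and a compare on every reclaim, which is the hot-path overhead weighed above.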
That said, if you're seeing corruption patterns in your customer
testing where a valid-looking-but-wrong SSN gets through before
FW_ERROR fires, I'd be very interested in the traces. That would
change the cost/benefit on the generation counter approach.
Thanks,
Cole