public inbox for linux-wireless@vger.kernel.org
From: "Korenblit, Miriam Rachel" <miriam.rachel.korenblit@intel.com>
To: Cole Leavitt <cole@unwrap.rs>,
	"greearb@candelatech.com" <greearb@candelatech.com>
Cc: "johannes@sipsolutions.net" <johannes@sipsolutions.net>,
	"linux-wireless@vger.kernel.org" <linux-wireless@vger.kernel.org>
Subject: RE: [PATCH 0/1] wifi: iwlwifi: mld: fix TSO segmentation explosion causing UAF
Date: Sun, 22 Mar 2026 12:29:29 +0000	[thread overview]
Message-ID: <DM3PPF63A6024A9EBDFBD21B924D24F3D0BA34AA@DM3PPF63A6024A9.namprd11.prod.outlook.com> (raw)
In-Reply-To: <20260218144723.31699-1-cole@unwrap.rs>



> -----Original Message-----
> From: Cole Leavitt <cole@unwrap.rs>
> Sent: Wednesday, February 18, 2026 4:47 PM
> To: greearb@candelatech.com
> Cc: johannes@sipsolutions.net; linux-wireless@vger.kernel.org; Korenblit, Miriam
> Rachel <miriam.rachel.korenblit@intel.com>; Cole Leavitt <cole@unwrap.rs>
> Subject: [PATCH 0/1] wifi: iwlwifi: mld: fix TSO segmentation explosion causing
> UAF
> 
> Ben,
> 
> I've been digging into the use-after-free crash you reported on your
> BE200 running the MLD driver (tcp_shifted_skb refcount underflow, followed by
> NULL deref in tcp_rack_detect_loss). I think I found the root cause -- it's a
> missing guard in the MLD TSO segmentation path that lets num_subframes=0
> reach skb_gso_segment(), producing the 32k+ segment explosion you're seeing.
> 
> Here's the full chain:
> 
> 1) mld/tlc.c:790 -- when firmware's TLC notification disables AMSDU for
>    a TID (bit not set in amsdu_enabled), the MLD driver sets:
> 
>      link_sta->agg.max_tid_amsdu_len[i] = 1;
> 
>    This sentinel value 1 means "AMSDU disabled on this TID".
> 
> 2) mld/tx.c:836-837 -- the TSO path checks:
> 
>      max_tid_amsdu_len = sta->cur->max_tid_amsdu_len[tid];
>      if (!max_tid_amsdu_len)   // <-- only catches zero, not 1
>          return iwl_tx_tso_segment(skb, 1, ...);
> 
>    Value 1 passes this check.
> 
> 3) mld/tx.c:847 -- the division produces zero:
> 
>      num_subframes = (1 + 2) / (1534 + 2) = 0
> 
>    Any max_tid_amsdu_len below ~1534 (one subframe) produces 0 here.
> 
> 4) iwl-utils.c:27 -- gso_size is set to zero:
> 
>      skb_shinfo(skb)->gso_size = num_subframes * mss = 0 * 1460 = 0
> 
> 5) iwl-utils.c:30 -- skb_gso_segment() with gso_size=0 creates 32001+
>    tiny segments, which is the error you're seeing:
> 
>      "skbuff: ERROR: Found more than 32000 packets in skb_segment"
>      "iwl-mvm-tx-tso-segment, list gso-segment list is huge: 32001"
> 
> 6) mld/tx.c:912-936 -- the loop queues ~1024 of those segments to the
>    TX ring before it fills up, then purges the rest. This creates a
>    massive burst of tiny frames that stress the BA completion path.
> 
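
The arithmetic in steps 3 and 4 of the quoted chain can be reproduced outside the kernel. The following is a minimal standalone sketch, not the driver code: the helper names model_num_subframes() and model_gso_size() are hypothetical, and the 4-byte padding rule is an assumption inferred from the "+ 2" terms in the quoted calculation.

```c
/*
 * Standalone model of the quoted arithmetic. These helpers are
 * hypothetical and exist only for illustration; they are not
 * functions from iwlwifi.
 */

/* Assumption: each subframe is padded to a 4-byte boundary, which
 * yields pad = 2 for the quoted subframe length of 1534 bytes. */
static unsigned int model_num_subframes(unsigned int max_tid_amsdu_len,
                                        unsigned int subf_len)
{
    unsigned int pad = (4 - subf_len) & 0x3;

    /* Integer division: any A-MSDU limit smaller than one padded
     * subframe rounds down to zero subframes. */
    return (max_tid_amsdu_len + pad) / (subf_len + pad);
}

/* Step 4 of the quoted chain: gso_size = num_subframes * mss. */
static unsigned int model_gso_size(unsigned int num_subframes,
                                   unsigned int mss)
{
    return num_subframes * mss;
}
```

Under these assumptions, model_num_subframes(1, 1534) evaluates to (1 + 2) / (1534 + 2) = 0, and model_gso_size(0, 1460) is then 0, matching the values quoted above.
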
> The MVM driver is immune because it checks mvmsta->amsdu_enabled (a
> separate bitmap) at tx.c:912 and tx.c:936 BEFORE ever reaching the
> num_subframes calculation. MLD has no equivalent -- it relies solely on
> max_tid_amsdu_len, and the sentinel value 1 slips through.
> 
> This explains all your observations:
> - 6.18 regression: BE200 moved from MVM (has guard) to MLD (no guard)
> - AP-specific: the problem AP causes firmware to disable AMSDU for the
>   active TID (other APs enable it, so max_tid_amsdu_len gets a proper
>   value from iwl_mld_get_amsdu_size_of_tid())
> - 28min gap between TSO explosion and UAF: the ~1024 micro-frame burst
>   creates massive alloc/free churn in the skb slab, which can corrupt
>   TCP retransmit queue entries allocated from the same cache
> - No firmware error: firmware is fine, the bug is purely in MLD's TSO
>   parameter calculation
> 
> The fix (in patch 1/1) adds a guard after the num_subframes calculation -- if it's
> zero, fall back to single-subframe TSO (num_subframes=1), which correctly sets
> gso_size=mss. This matches what MVM effectively does via its amsdu_enabled
> checks.
> 
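
The fallback described in the quoted paragraph can be sketched in the same standalone style. This is an illustration of the idea only, not the contents of patch 1/1: the helper name model_num_subframes_guarded() is hypothetical, and the padding rule is again an assumption inferred from the quoted numbers.

```c
/*
 * Hypothetical model of the described guard: if the computed
 * subframe count is zero, fall back to a single subframe so that
 * gso_size ends up equal to mss instead of zero.
 */
static unsigned int model_num_subframes_guarded(unsigned int max_tid_amsdu_len,
                                                unsigned int subf_len)
{
    /* Assumption: subframes are padded to a 4-byte boundary. */
    unsigned int pad = (4 - subf_len) & 0x3;
    unsigned int num_subframes =
        (max_tid_amsdu_len + pad) / (subf_len + pad);

    /* Guard: never hand a zero subframe count to the segmentation step. */
    return num_subframes ? num_subframes : 1;
}
```

With the sentinel value, model_num_subframes_guarded(1, 1534) returns 1, so num_subframes * mss gives 1460 rather than 0. Limits large enough for two or more subframes are left unchanged by the guard.
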
> Could you test this against the problem AP? Two things that would help confirm
> the theory:
> 
> 1) Before applying the fix, add this debug print to see the actual
>    max_tid_amsdu_len value with the problem AP:
> 
>      // In iwl_mld_tx_tso_segment(), after line 847
>      if (!num_subframes)
>          pr_warn_once("iwlmld: num_subframes=0, max_tid_amsdu_len=%u "
>                       "subf_len=%u mss=%u\n",
>                       max_tid_amsdu_len, subf_len, mss);
> 
> 2) After applying the fix, run against the problem AP for 1+ day and
>    check if both the TSO explosion AND the UAF are gone.
> 
> I also noticed a few secondary defense-in-depth regressions in MLD's TX
> completion path vs MVM:
> 
> - MLD's iwl_mld_tx_reclaim_txq() has no per-TID reclaim tracking
>   (MVM has tid_data->next_reclaimed and validates tid_data->txq_id)
> - The transport-level reclaim_lock prevents direct double-free, but
>   MLD is missing MVM's extra safety checks
> 
> These are probably not directly causing your crash, but worth noting.
> 
> Cole Leavitt (1):
>   wifi: iwlwifi: mld: fix TSO segmentation explosion when AMSDU is
>     disabled
> 
>  drivers/net/wireless/intel/iwlwifi/mld/tx.c | 11 +++++++++++
>  1 file changed, 11 insertions(+)
> 
> --
> 2.52.0

Thank you for the clear analysis!

Miri


