public inbox for linux-mediatek@lists.infradead.org
 help / color / mirror / Atom feed
* [PATCH AUTOSEL 7.0-6.19] wifi: mt76: avoid to set ACK for MCU command if wait_resp is not set
       [not found] <20260420132314.1023554-1-sashal@kernel.org>
@ 2026-04-20 13:17 ` Sasha Levin
  2026-04-20 13:17 ` [PATCH AUTOSEL 7.0-6.18] phy: phy-mtk-tphy: Update names and format of kernel-doc comments Sasha Levin
                   ` (15 subsequent siblings)
  16 siblings, 0 replies; 17+ messages in thread
From: Sasha Levin @ 2026-04-20 13:17 UTC (permalink / raw)
  To: patches, stable
  Cc: StanleyYP Wang, Shayne Chen, Felix Fietkau, Sasha Levin, lorenzo,
	ryder.lee, matthias.bgg, angelogioacchino.delregno,
	linux-wireless, linux-kernel, linux-arm-kernel, linux-mediatek

From: StanleyYP Wang <StanleyYP.Wang@mediatek.com>

[ Upstream commit 169c83d3df95b57e787174454332e01eb1b823ed ]

When wait_resp is not set but the ACK option is enabled in the MCU TXD,
the ACK event is enqueued to the MCU event queue without being dequeued
by the original MCU command request.

Any orphaned ACK events will only be removed from the queue when another
MCU command requests a response. Due to sequence index mismatches, these
events are discarded one by one until a matching sequence index is found.

However, if several MCU commands that do not require a response continue
to fill up the event queue, there is a risk that when an MCU command with
wait_resp enabled is issued, it may dequeue the wrong event skb,
especially if the queue contains events with all possible sequence
indices.

Signed-off-by: StanleyYP Wang <StanleyYP.Wang@mediatek.com>
Signed-off-by: Shayne Chen <shayne.chen@mediatek.com>
Link: https://patch.msgid.link/20260203155532.1098290-3-shayne.chen@mediatek.com
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

LLM Generated explanations, may be completely bogus:

Now I have enough information to compile the full analysis.

---

## PHASE 1: COMMIT MESSAGE FORENSICS

**Step 1.1: Subject Line**
- Subsystem: `wifi: mt76:`
- Action verb: "avoid" (indicating a prevention of incorrect behavior -
  this is a bug fix verb)
- Summary: Prevent setting ACK flag on MCU commands that don't wait for
  a response
- Record: [wifi: mt76] [avoid] [prevent orphaned ACK events in MCU event
  queue]

**Step 1.2: Tags**
- `Signed-off-by: StanleyYP Wang <StanleyYP.Wang@mediatek.com>` -
  author, MediaTek engineer
- `Signed-off-by: Shayne Chen <shayne.chen@mediatek.com>` - regular mt76
  contributor
- `Link: https://patch.msgid.link/20260203155532.1098290-3-
  shayne.chen@mediatek.com` - **patch 3** of a series
- `Signed-off-by: Felix Fietkau <nbd@nbd.name>` - mt76 maintainer,
  merged the patch
- No Fixes: tag (expected for candidates), no Reported-by, no syzbot
- Record: Author is MediaTek HW vendor engineer; committed by subsystem
  maintainer. Part of a series (patch 3).

**Step 1.3: Commit Body**
- Bug: When `wait_resp` is not set, ACK option is still set in MCU TXD.
  Firmware generates ACK events that nobody dequeues.
- Symptom: Orphaned ACK events accumulate in event queue. When a command
  with `wait_resp=true` is issued, it may dequeue a wrong event
  (sequence index mismatch), leading to incorrect MCU communication.
- Failure mode: MCU command/response mismatch, potential driver
  malfunction.
- Record: [MCU event queue pollution by orphaned ACK events] [Wrong
  event dequeued by subsequent commands] [No specific kernel version
  mentioned] [Root cause: ACK option unconditionally set regardless of
  wait_resp]

**Step 1.4: Hidden Bug Fix Detection**
- "avoid to set ACK" = preventing incorrect firmware behavior
- This is explicitly a bug fix disguised with "avoid" rather than "fix"
- Record: Yes, this is a real bug fix. Prevents event queue corruption.

## PHASE 2: DIFF ANALYSIS

**Step 2.1: Inventory**
- `drivers/net/wireless/mediatek/mt76/mcu.c`: 1 line changed
- `drivers/net/wireless/mediatek/mt76/mt7996/mcu.c`: ~8 lines changed
  (option logic restructured, SDO special case removed)
- Functions modified: `mt76_mcu_skb_send_and_get_msg()` in mcu.c,
  `mt7996_mcu_send_message()` in mt7996/mcu.c
- Record: [2 files, ~10 lines net change] [Single-subsystem surgical
  fix]

**Step 2.2: Code Flow Change**
- Hunk 1 (mcu.c): Changed `dev->mcu_ops->mcu_skb_send_msg(dev, skb, cmd,
  &seq)` to `dev->mcu_ops->mcu_skb_send_msg(dev, skb, cmd, wait_resp ?
  &seq : NULL)`. Before: always passes seq pointer. After: passes NULL
  when no response needed.
- Hunk 2 (mt7996/mcu.c): Old code always set ACK via
  `MCU_CMD_UNI_QUERY_ACK` or `MCU_CMD_UNI_EXT_ACK`, then special-cased
  SDO to strip ACK. New code builds option from `MCU_CMD_UNI` base,
  conditionally adds `MCU_CMD_SET` and `MCU_CMD_ACK` (only when
  `wait_seq` is non-NULL).
- Record: [Always ACK → conditional ACK based on wait_resp]

**Step 2.3: Bug Mechanism**
- Category: Logic/correctness fix
- Mechanism: The MCU TXD option field had ACK unconditionally set. When
  `wait_resp=false`, the caller never dequeues the resulting ACK event.
  These orphaned events accumulate and can cause subsequent
  `wait_resp=true` commands to get wrong events.
- The fix makes the firmware-facing ACK flag consistent with the driver-
  side intent.
- Record: [Logic/correctness] [Unconditional ACK flag causes orphaned
  events in MCU queue]

**Step 2.4: Fix Quality**
- Verified equivalence: When `wait_seq` is non-NULL, the new option
  values match old values exactly:
  - Query: `MCU_CMD_UNI | MCU_CMD_ACK` = 0x3 = `MCU_CMD_UNI_QUERY_ACK`
  - Non-query: `MCU_CMD_UNI | MCU_CMD_SET | MCU_CMD_ACK` = 0x7 =
    `MCU_CMD_UNI_EXT_ACK`
- The SDO special case removal is correct because SDO commands that
  don't wait will naturally have no ACK.
- Regression risk: Low. All 11 `mcu_skb_send_msg` implementations handle
  NULL `wait_seq` safely (verified via code review).
- Record: [Fix is obviously correct, verified logic equivalence] [Very
  low regression risk]

## PHASE 3: GIT HISTORY

**Step 3.1: Blame**
- mcu.c line 101: Introduced by `e452c6eb55fbfd` (Felix Fietkau,
  2020-09-30) - "mt76: move waiting and locking out of
  mcu_ops->mcu_skb_send_msg". The always-pass-seq behavior has been
  present since 2020.
- mt7996/mcu.c option logic: Introduced by `98686cd21624c7` (Shayne
  Chen, 2022-11-22) - initial mt7996 driver commit.
- SDO special case: `dab5b2025452f9` (Peter Chiu, 2025-11-06) - a
  targeted fix for the same class of bug, already in 7.0 tree.
- Record: [Buggy code from 2020 (mcu.c) and 2022 (mt7996)] [Present in
  all kernels since v6.2]

**Step 3.2: No Fixes: tag** - expected, N/A

**Step 3.3: File History**
- mcu.c has had only 4 changes since v6.6 (relicense, SDIO, retry,
  refcount)
- mt7996/mcu.c has had 149 commits since initial driver
- Record: [mcu.c is stable code; mt7996/mcu.c actively developed]

**Step 3.4: Author**
- StanleyYP Wang and Shayne Chen are regular MediaTek mt76 contributors
  (20+ commits each)
- Felix Fietkau is the mt76 subsystem maintainer who merged this
- Record: [Author is subsystem vendor engineer; merged by maintainer]

**Step 3.5: Dependencies**
- Patch 3 of a series (from message-id). Other patches may affect mt7925
  or other files.
- This patch is self-contained: the mcu.c change is a one-line
  conditional, and the mt7996 change is a local restructuring.
- The SDO commit (`dab5b2025452f9`) is already in 7.0 tree, and this
  patch supersedes it.
- Record: [Part of series but functionally standalone for mt7996]

## PHASE 4: MAILING LIST RESEARCH

- lore.kernel.org was behind anti-bot protection; could not fetch.
- The Link tag points to
  `patch.msgid.link/20260203155532.1098290-3-shayne.chen@mediatek.com`
  confirming it's patch 3 of a series.
- Merged by Felix Fietkau (mt76 maintainer) which implies review and
  acceptance.
- Record: [Could not access lore] [Patch merged by subsystem maintainer]

## PHASE 5: CODE SEMANTIC ANALYSIS

**Step 5.1: Key Functions**
- `mt76_mcu_skb_send_and_get_msg()` - core MCU send/receive path for all
  mt76 drivers
- `mt7996_mcu_send_message()` - mt7996-specific TXD preparation and send

**Step 5.2: Callers**
- `mt76_mcu_skb_send_and_get_msg` is called from
  `mt76_mcu_send_and_get_msg()` and `mt76_mcu_skb_send_msg()` (inline
  wrapper). These are the primary MCU command interfaces used throughout
  all mt76 drivers.
- Record: [Core MCU path, called from dozens of locations in all mt76
  drivers]

**Step 5.4: Call Chain for wait_resp=false**
- `__mt76_mcu_send_firmware` → `mt76_mcu_send_msg(... false)` →
  `mt76_mcu_skb_send_and_get_msg(... false)` → `mcu_skb_send_msg(...,
  NULL)`
- Firmware scatter commands skip TXD option setup via `goto exit`, so
  those are unaffected.
- Record: [Currently, no mt7996 UNI commands are sent with
  wait_resp=false in this tree, but the fix is architecturally correct]

## PHASE 6: STABLE TREE ANALYSIS

**Step 6.1: Code Exists in Stable**
- mt7996 driver was introduced in v6.2 (commit `98686cd21624c7`)
- The buggy ACK-always-on pattern exists in all kernels since v6.2
- Record: [Present in stable trees 6.6.y and later]

**Step 6.2: Backport Complications**
- The mcu.c change should apply cleanly (context is stable since 2024).
- The mt7996/mcu.c change context includes the SDO special case
  (`dab5b2025452f9`), which was merged in v6.14 cycle. For older stable
  trees (6.6.y, 6.1.y), this SDO commit may not exist, requiring minor
  context adjustment.
- Record: [Clean apply for 7.0; may need minor adaptation for older
  stables]

## PHASE 7: SUBSYSTEM CONTEXT

- Subsystem: wifi (drivers/net/wireless/mediatek/mt76) - WiFi driver
- Criticality: IMPORTANT - mt76 is a widely-used WiFi chipset family
  (MediaTek)
- mt7996 is the Wi-Fi 7 (802.11be) driver, relatively new but growing
  user base
- Record: [IMPORTANT subsystem; growing user base for mt7996]

## PHASE 8: IMPACT AND RISK

**Step 8.1: Affected Users** - mt7996/mt7992 WiFi users (Wi-Fi 7
hardware)

**Step 8.2: Trigger Conditions** - Multiple MCU commands without
response need to be sent. Currently the SDO case is already fixed
separately. The broader fix is defensive/architectural.

**Step 8.3: Failure Mode** - MCU command/response mismatch → WiFi driver
malfunction, potential command timeouts. Severity: MEDIUM-HIGH (not a
crash/panic, but WiFi stops working correctly).

**Step 8.4: Risk-Benefit**
- Benefit: MEDIUM - fixes architectural correctness issue, prevents
  class of MCU communication bugs
- Risk: LOW - changes are minimal, all implementations verified to
  handle NULL safely
- Record: [Medium benefit, low risk = favorable ratio]

## PHASE 9: FINAL SYNTHESIS

**Evidence FOR backporting:**
- Fixes a real architectural bug in MCU event handling
- Small and contained (2 files, ~10 lines)
- Obviously correct (verified logic equivalence for all option values)
- All 11 `mcu_skb_send_msg` implementations handle NULL safely
- Merged by subsystem maintainer (Felix Fietkau)
- Authors are MediaTek vendor engineers who understand the hardware
- Bug present since mt7996 introduction (v6.2)
- Subsumes the SDO-specific band-aid fix

**Evidence AGAINST backporting:**
- Part of a series (patch 3/N) - unclear if fully standalone for all
  scenarios
- Immediate practical impact limited: SDO case already fixed separately
- No user reports or syzbot triggers documented
- Changes the interface contract for `mcu_skb_send_msg` across all mt76
  drivers
- Broader fix is somewhat "preventive" for future commands that may be
  added

**Stable Rules Checklist:**
1. Obviously correct? YES (verified logic equivalence)
2. Fixes real bug? YES (orphaned events can corrupt MCU command
   handling)
3. Important? MEDIUM (WiFi malfunction, not crash)
4. Small and contained? YES (~10 lines, 2 files)
5. No new features? YES (pure bug fix)
6. Applies to stable? YES for 7.0 (SDO commit context matches)

**Verification:**
- [Phase 1] Parsed subject: "avoid" verb indicates bug prevention; patch
  3 of series
- [Phase 2] Diff: 1 line in mcu.c (conditional NULL), ~8 lines
  restructured in mt7996/mcu.c
- [Phase 2] Logic equivalence verified: MCU_CMD_UNI|MCU_CMD_ACK =
  MCU_CMD_UNI_QUERY_ACK (0x3), MCU_CMD_UNI|MCU_CMD_SET|MCU_CMD_ACK =
  MCU_CMD_UNI_EXT_ACK (0x7)
- [Phase 3] git blame: mcu.c line from e452c6eb55fbfd (2020), mt7996
  from 98686cd21624c7 (2022)
- [Phase 3] SDO fix dab5b2025452f9 confirmed in 7.0 tree
- [Phase 3] mt7996 mcu.c option code originates from initial driver
  (98686cd21624c7)
- [Phase 5] All 11 mcu_skb_send_msg implementations verified to handle
  NULL wait_seq safely (subagent confirmed no NULL dereferences)
- [Phase 5] No mt76_mcu_skb_send_msg(..., false) calls found in mt7996
  for UNI commands; firmware scatter skips TXD
- [Phase 6] mt7996 exists in stable 6.6.y and later
- [Phase 8] Failure: MCU response mismatch → WiFi malfunction, severity
  MEDIUM-HIGH
- UNVERIFIED: Could not access lore.kernel.org to check full series
  context and reviewer feedback

The fix is architecturally correct and addresses a real design flaw in
the MCU command path. While the immediate practical impact is somewhat
mitigated by the existing SDO fix, this is the proper general solution
that prevents the entire class of orphaned ACK events. The change is
small, safe, and merged by the subsystem maintainer. The risk is very
low given all implementations handle the NULL parameter correctly.

**YES**

 drivers/net/wireless/mediatek/mt76/mcu.c        |  2 +-
 drivers/net/wireless/mediatek/mt76/mt7996/mcu.c | 11 +++++------
 2 files changed, 6 insertions(+), 7 deletions(-)

diff --git a/drivers/net/wireless/mediatek/mt76/mcu.c b/drivers/net/wireless/mediatek/mt76/mcu.c
index 535c3d8a9cc0d..cbfb3bbec5031 100644
--- a/drivers/net/wireless/mediatek/mt76/mcu.c
+++ b/drivers/net/wireless/mediatek/mt76/mcu.c
@@ -98,7 +98,7 @@ int mt76_mcu_skb_send_and_get_msg(struct mt76_dev *dev, struct sk_buff *skb,
 	/* orig skb might be needed for retry, mcu_skb_send_msg consumes it */
 	if (orig_skb)
 		skb_get(orig_skb);
-	ret = dev->mcu_ops->mcu_skb_send_msg(dev, skb, cmd, &seq);
+	ret = dev->mcu_ops->mcu_skb_send_msg(dev, skb, cmd, wait_resp ? &seq : NULL);
 	if (ret < 0)
 		goto out;
 
diff --git a/drivers/net/wireless/mediatek/mt76/mt7996/mcu.c b/drivers/net/wireless/mediatek/mt76/mt7996/mcu.c
index 54776f0703876..0abe5efa9424e 100644
--- a/drivers/net/wireless/mediatek/mt76/mt7996/mcu.c
+++ b/drivers/net/wireless/mediatek/mt76/mt7996/mcu.c
@@ -322,13 +322,12 @@ mt7996_mcu_send_message(struct mt76_dev *mdev, struct sk_buff *skb,
 		uni_txd->pkt_type = MCU_PKT_ID;
 		uni_txd->seq = seq;
 
-		if (cmd & __MCU_CMD_FIELD_QUERY)
-			uni_txd->option = MCU_CMD_UNI_QUERY_ACK;
-		else
-			uni_txd->option = MCU_CMD_UNI_EXT_ACK;
+		uni_txd->option = MCU_CMD_UNI;
+		if (!(cmd & __MCU_CMD_FIELD_QUERY))
+			uni_txd->option |= MCU_CMD_SET;
 
-		if (mcu_cmd == MCU_UNI_CMD_SDO)
-			uni_txd->option &= ~MCU_CMD_ACK;
+		if (wait_seq)
+			uni_txd->option |= MCU_CMD_ACK;
 
 		if ((cmd & __MCU_CMD_FIELD_WA) && (cmd & __MCU_CMD_FIELD_WM))
 			uni_txd->s2d_index = MCU_S2D_H2CN;
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH AUTOSEL 7.0-6.18] phy: phy-mtk-tphy: Update names and format of kernel-doc comments
       [not found] <20260420132314.1023554-1-sashal@kernel.org>
  2026-04-20 13:17 ` [PATCH AUTOSEL 7.0-6.19] wifi: mt76: avoid to set ACK for MCU command if wait_resp is not set Sasha Levin
@ 2026-04-20 13:17 ` Sasha Levin
  2026-04-20 13:18 ` [PATCH AUTOSEL 7.0-6.12] Bluetooth: btmtk: add MT7902 MCU support Sasha Levin
                   ` (14 subsequent siblings)
  16 siblings, 0 replies; 17+ messages in thread
From: Sasha Levin @ 2026-04-20 13:17 UTC (permalink / raw)
  To: patches, stable
  Cc: Vinod Koul, Sasha Levin, chunfeng.yun, matthias.bgg,
	angelogioacchino.delregno, linux-arm-kernel, linux-mediatek,
	linux-phy, linux-kernel

From: Vinod Koul <vkoul@kernel.org>

[ Upstream commit 8d869bc943cfe5db08f5aff355b1d8d3abeda865 ]

mtk_phy_pdata documentation does not use correct tag for struct, while at
it fix one of member wrongly documented.

Warning: drivers/phy/mediatek/phy-mtk-tphy.c:289 cannot understand function prototype: 'struct mtk_phy_pdata'
Warning: drivers/phy/mediatek/phy-mtk-tphy.c:296 struct member 'slew_ref_clock_mhz' not described in 'mtk_phy_pdata'

Link: https://patch.msgid.link/20260223071032.408425-1-vkoul@kernel.org
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

LLM Generated explanations, may be completely bogus:

Based on my investigation, I have enough context to provide a thorough
analysis.

## Analysis

### PHASE 1: COMMIT MESSAGE FORENSICS

**Step 1.1** - Subject Line Parse:
- Record: [phy/phy-mtk-tphy] [Update] [kernel-doc comment formatting and
  member name correction]

**Step 1.2** - Tags:
- Link:
  https://patch.msgid.link/20260223071032.408425-1-vkoul@kernel.org
- Signed-off-by: Vinod Koul <vkoul@kernel.org> (phy subsystem
  maintainer)
- No Fixes: tag, no Cc: stable, no Reported-by, no Tested-by, no
  Reviewed-by

**Step 1.3** - Commit Body:
- Bug described: Two kernel-doc warnings:
  1. `cannot understand function prototype: 'struct mtk_phy_pdata'`
     (missing "struct" tag)
  2. `struct member 'slew_ref_clock_mhz' not described` (doc says
     `slew_ref_clk_mhz` but the actual struct member is
     `slew_ref_clock_mhz`)
- Failure mode: doc generation warnings; no runtime impact

**Step 1.4** - Hidden bug fix detection:
- Record: This is NOT a hidden bug fix. It is a pure kernel-
  doc/documentation correctness fix. No runtime behavior changes.

### PHASE 2: DIFF ANALYSIS

**Step 2.1** - Inventory:
- Files: `drivers/phy/mediatek/phy-mtk-tphy.c` (1 file)
- Lines changed: 2 lines modified (comment only)
- Functions: None (only a struct's kernel-doc block)
- Scope: single-file, surgical, comments only

**Step 2.2** - Flow change:
- Before: `mtk_phy_pdata - SoC...` and `@slew_ref_clk_mhz:` in comments
- After: `struct mtk_phy_pdata - SoC...` and `@slew_ref_clock_mhz:` in
  comments
- No executable code changed

**Step 2.3** - Bug mechanism:
- Category: Documentation correctness. The kernel-doc parser rejects the
  struct doc block because it lacks the `struct` keyword, and then flags
  the unmatched member name.

**Step 2.4** - Fix quality:
- Obviously correct (just comment text)
- Zero regression risk (no runtime code)

### PHASE 3: GIT HISTORY INVESTIGATION

**Step 3.1** - Blame:
- `9cc82c2498b4f` ("phy: mediatek: tphy: Clarify and add kerneldoc to
  mtk_phy_pdata"): first added the kerneldoc without the `struct`
  prefix. Landed in v6.17-rc1.
- `d6306fc5d77b7` ("phy: mediatek: tphy: Cleanup and document slew
  calibration"): introduced the `slew_ref_clk_mhz` doc line while naming
  the actual member `slew_ref_clock_mhz`. Landed in v6.17-rc1.

**Step 3.2** - No Fixes: tag. Effectively the fix addresses both commits
above.

**Step 3.3** - Related changes: None relevant; no dependency or series.

**Step 3.4** - Author: Vinod Koul is the phy subsystem maintainer. High
trust.

**Step 3.5** - Dependencies: None. Standalone 2-line comment change.

### PHASE 4: MAILING LIST RESEARCH

- Lore fetch attempted but blocked by Anubis. The Link: tag points to
  vkoul@kernel.org posting.
- Record: Patch was posted on Feb 23 2026 by the subsystem maintainer.
  No evidence of controversy.

### PHASE 5: CODE SEMANTIC ANALYSIS

- Only a comment block is changed; the struct itself and all callers are
  unaffected. No reachability change.

### PHASE 6: CROSS-REFERENCING AND STABLE TREE ANALYSIS

**Step 6.1** - Buggy code presence:
- Verified present in stable/linux-6.17.y, 6.18.y, 6.19.y (checked files
  directly; same problematic kerneldoc block exists in all three).
- Not present in 6.12.y and older (the kerneldoc block wasn't added
  there).

**Step 6.2** - Backport complications:
- The diff applies against the exact same surrounding context in 6.17.y,
  6.18.y, 6.19.y. Trivial clean apply.

**Step 6.3** - No prior fix found in stable branches.

### PHASE 7: SUBSYSTEM CONTEXT

- Subsystem: drivers/phy/mediatek (PERIPHERAL - MediaTek SoC-specific
  T-PHY)
- Author is the subsystem maintainer

### PHASE 8: IMPACT AND RISK ASSESSMENT

**Step 8.1** - Affected: Only developers generating kernel docs. No end-
user runtime impact whatsoever.

**Step 8.2** - Trigger: Running `make htmldocs` or similar against the
file.

**Step 8.3** - Severity: LOW. Pure cosmetic/doc-build warnings. The
struct's `slew_ref_clock_mhz` member appears as "undocumented" in
generated docs, but no functional issue.

**Step 8.4** - Benefit vs Risk:
- Benefit: Silences two kernel-doc warnings; correct generated
  documentation.
- Risk: Essentially zero (comment text change).

### PHASE 9: FINAL SYNTHESIS

**Evidence FOR:**
- Precedent: recent kernel-doc-only warning fixes have been AUTOSEL'd to
  stable (e.g., `45285d357287b` backlight lp855x, `a1bf8ea50cbc4`
  kfence, `18b6574d4c647` net/bridge, `c2277e285923b` scatterlist,
  `1cce5a5ecafeb` mtd/docg3) with similar "struct member X not
  described" style warnings
- Stable kernel rules explicitly list documentation/comment fixes as
  allowed (zero runtime regression risk)
- Small and surgical (2 line comment diff)
- Buggy doc exists in active stable trees (6.17+)
- Author is the subsystem maintainer
- Patch applies cleanly to 6.17.y / 6.18.y / 6.19.y

**Evidence AGAINST:**
- No Fixes: tag, no Cc: stable
- Pure cosmetic - no user-visible impact
- Does not fix a crash, leak, race, security issue, or data corruption
- Fails the "important issue" criterion strictly

**Stable rules check:**
1. Obviously correct: YES (trivial comment fix)
2. Fixes a real bug: Marginal (kerneldoc warning only)
3. Important issue: NO by strict reading, but in the recognized
   exception category (documentation fix)
4. Small and contained: YES (2 lines)
5. No new features: YES
6. Applies to stable: YES

**Exception category:** Falls under "Documentation and comment fixes" -
explicitly allowed by stable rules as zero-risk.

### Verification:
- [Phase 1] Parsed tags: Link tag only; no Fixes, Reported-by, Cc
  stable, Reviewed-by/Acked-by in this particular submission.
- [Phase 2] Read the diff: verified only 2 comment lines change (adds
  "struct " and renames `slew_ref_clk_mhz` -> `slew_ref_clock_mhz` in
  kerneldoc).
- [Phase 3] `git log -- drivers/phy/mediatek/phy-mtk-tphy.c`: confirmed
  `9cc82c2498b4f` added the kerneldoc without "struct" prefix;
  `d6306fc5d77b7` added the member-name mismatch.
- [Phase 3] `git describe --contains` for `9cc82c2498b4f` and
  `d6306fc5d77b7`: both resolve to v6.17-rc1.
- [Phase 3] `git branch -r --contains d6306fc5d77b7`: present in
  stable/linux-6.17.y, 6.18.y, 6.19.y; not in older stable.
- [Phase 4] WebFetch of lore URL blocked by Anubis (UNVERIFIED); could
  not examine review thread directly.
- [Phase 6] `git show stable/linux-6.17.y:drivers/phy/mediatek/phy-mtk-
  tphy.c` and the 6.18.y / 6.19.y equivalents: confirmed the same broken
  kerneldoc block is present in all three active stable branches; the
  struct member is `slew_ref_clock_mhz` in all of them; diff will apply
  cleanly.
- [Phase 6] Searched for prior fix in stable branches (`git log
  --grep="slew_ref"`): none found.
- [Phase 6] Verified similar AUTOSEL precedent: `git show
  45285d357287b`, `a1bf8ea50cbc4`, `18b6574d4c647`, `c2277e285923b`,
  `1cce5a5ecafeb` - all are kernel-doc-only warning fixes that were
  AUTOSEL'd by Sasha Levin into stable.
- [Phase 7] Author identity: Vinod Koul is the phy subsystem maintainer
  (confirmed via MAINTAINERS history).
- [Phase 8] Failure mode: kernel-doc build warnings only; no runtime
  effect - verified by reading the diff (no executable code changed).
- UNVERIFIED: Could not read mailing list thread (Anubis block); did not
  verify reviewer feedback or stable nominations directly.

### Decision

The change is a pure kernel-doc warning fix: it adds the missing
`struct` keyword and corrects a member-name typo so the kernel-doc
parser emits the correct docs for `struct mtk_phy_pdata`. The broken
kerneldoc exists in 6.17.y, 6.18.y and 6.19.y stable trees, the patch
applies trivially, and the risk of regression is effectively zero (no
executable code is touched). This falls squarely in the
"Documentation/comment fixes" exception category explicitly allowed by
stable kernel rules, and there is recent AUTOSEL precedent for
backporting similar kerneldoc-only warning fixes (lp855x, kfence,
bridge, scatterlist, docg3). The strongest argument against is that it
fails the strict "important issue" test, but the risk/benefit and
precedent tip it to YES.

**YES**

 drivers/phy/mediatek/phy-mtk-tphy.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/phy/mediatek/phy-mtk-tphy.c b/drivers/phy/mediatek/phy-mtk-tphy.c
index f6504e0ecd1a7..acf5065295072 100644
--- a/drivers/phy/mediatek/phy-mtk-tphy.c
+++ b/drivers/phy/mediatek/phy-mtk-tphy.c
@@ -276,14 +276,14 @@ enum mtk_phy_version {
 };
 
 /**
- * mtk_phy_pdata - SoC specific platform data
+ * struct mtk_phy_pdata - SoC specific platform data
  * @avoid_rx_sen_degradation: Avoid TX Sensitivity level degradation (MT6795/8173 only)
  * @sw_pll_48m_to_26m:        Workaround for V3 IP (MT8195) - switch the 48MHz PLL from
  *                            fractional mode to integer to output 26MHz for U2PHY
  * @sw_efuse_supported:       Switches off eFuse auto-load from PHY and applies values
  *                            read from different nvmem (usually different eFuse array)
  *                            that is pointed at in the device tree node for this PHY
- * @slew_ref_clk_mhz:         Default reference clock (in MHz) for slew rate calibration
+ * @slew_ref_clock_mhz:         Default reference clock (in MHz) for slew rate calibration
  * @slew_rate_coefficient:    Coefficient for slew rate calibration
  * @version:                  PHY IP Version
  */
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH AUTOSEL 7.0-6.12] Bluetooth: btmtk: add MT7902 MCU support
       [not found] <20260420132314.1023554-1-sashal@kernel.org>
  2026-04-20 13:17 ` [PATCH AUTOSEL 7.0-6.19] wifi: mt76: avoid to set ACK for MCU command if wait_resp is not set Sasha Levin
  2026-04-20 13:17 ` [PATCH AUTOSEL 7.0-6.18] phy: phy-mtk-tphy: Update names and format of kernel-doc comments Sasha Levin
@ 2026-04-20 13:18 ` Sasha Levin
  2026-04-20 13:18 ` [PATCH AUTOSEL 7.0-6.18] wifi: mt76: flush pending TX before channel switch Sasha Levin
                   ` (13 subsequent siblings)
  16 siblings, 0 replies; 17+ messages in thread
From: Sasha Levin @ 2026-04-20 13:18 UTC (permalink / raw)
  To: patches, stable
  Cc: Sean Wang, Luiz Augusto von Dentz, Sasha Levin, marcel,
	luiz.dentz, matthias.bgg, angelogioacchino.delregno,
	linux-bluetooth, linux-kernel, linux-arm-kernel, linux-mediatek

From: Sean Wang <sean.wang@mediatek.com>

[ Upstream commit aab25984e55972e53f3e58821cb85a7101876056 ]

Add MT7902 device ID and firmware filename to enable MCU firmware
loading.

Signed-off-by: Sean Wang <sean.wang@mediatek.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

LLM Generated explanations, may be completely bogus:

Now I have enough information for a thorough analysis. Let me compile my
findings.

---

## PHASE 1: COMMIT MESSAGE FORENSICS

**Step 1.1: PARSE THE SUBJECT LINE**
Record: [Bluetooth: btmtk] [add] [MT7902 MCU support - adds device ID
and firmware filename for MT7902 chip]

**Step 1.2: PARSE ALL COMMIT MESSAGE TAGS**
- Signed-off-by: Sean Wang <sean.wang@mediatek.com> - Author, MediaTek
  employee, regular Bluetooth/WiFi contributor
- Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> -
  Bluetooth subsystem maintainer, applied the patch
- No Fixes: tag (expected for this type of commit)
- No Reported-by: (this is hardware enablement, not a bug report)
- No Cc: stable (expected)

Record: Author is MediaTek developer. Applied by Bluetooth subsystem
maintainer.

**Step 1.3: ANALYZE THE COMMIT BODY TEXT**
The commit message is straightforward: "Add MT7902 device ID and
firmware filename to enable MCU firmware loading." This is new hardware
enablement for the MT7902 Bluetooth chip.

Record: No bug description - this is a device ID addition for hardware
enablement.

**Step 1.4: DETECT HIDDEN BUG FIXES**
Without this patch, MT7902 devices that are matched by the vendor
wildcard USB entry `USB_VENDOR_AND_INTERFACE_INFO(0x0e8d, 0xe0, 0x01,
0x01)` will hit the `default:` case in `btmtk_usb_setup()` and return
-ENODEV with "Unsupported hardware variant". This effectively makes the
hardware non-functional.

Record: This is a hardware enablement commit, not a hidden bug fix. But
it prevents -ENODEV for real hardware.

---

## PHASE 2: DIFF ANALYSIS

**Step 2.1: INVENTORY THE CHANGES**
- `drivers/bluetooth/btmtk.h`: +1 line (FIRMWARE_MT7902 define)
- `drivers/bluetooth/btmtk.c`: +1 line (case 0x7902: in switch)
- Total: 2 lines added, 0 removed
- Functions modified: `btmtk_usb_setup()` (new case label in switch)
- Scope: single-file surgical addition

Record: 2 files, 2 lines added, scope is minimal.

**Step 2.2: UNDERSTAND THE CODE FLOW CHANGE**
- Before: `btmtk_usb_setup()` switch on dev_id has cases for 0x7663,
  0x7668, 0x7922, 0x7925, 0x7961. Device ID 0x7902 falls to `default:`
  -> returns -ENODEV.
- After: 0x7902 falls through to the same path as 0x7922/0x7925/0x7961,
  which calls `btmtk_fw_get_filename()` to generate firmware name and
  `btmtk_setup_firmware_79xx()` to load it.

Record: Adds a case label to fall through to existing firmware loading
code. No new execution paths.

**Step 2.3: IDENTIFY THE BUG MECHANISM**
Category: Hardware workaround / Device ID addition.
The change adds chip ID 0x7902 to a switch statement and a firmware
filename define. The firmware name generation function
`btmtk_fw_get_filename()` already handles 0x7902 correctly via its
`else` branch, producing `"mediatek/BT_RAM_CODE_MT7902_1_%x_hdr.bin"`.

Record: Device ID addition pattern. Existing code infrastructure handles
0x7902 without changes.

**Step 2.4: ASSESS THE FIX QUALITY**
- Obviously correct: new case label falls through to identical handling
  as 0x7922/0x7925/0x7961
- Minimal and surgical: 2 lines
- Regression risk: essentially zero - this code path was unreachable
  before (would hit default case)
- No red flags

Record: Trivially correct, zero regression risk.

---

## PHASE 3: GIT HISTORY INVESTIGATION

**Step 3.1: BLAME THE CHANGED LINES**
- `case 0x7922:` introduced by Chris Lu in 5c5e8c52e3cafa (2024-07-04) -
  the btmtk refactoring commit
- `case 0x7961:` introduced by Hao Qin in a7208610761ae9 (2025-01-10) -
  same pattern of adding device ID
- The switch statement and firmware loading infrastructure have been in
  the tree since mid-2024

Record: Code infrastructure stable since mid-2024. Existing device IDs
added via same pattern.

**Step 3.2: FOLLOW THE FIXES TAG**
No Fixes: tag present (expected for device ID additions).

**Step 3.3: CHECK FILE HISTORY**
Recent changes to btmtk.c are mostly refactoring (btusb -> btmtk moves)
and bug fixes (UAF, shutdown timeout). The device ID infrastructure is
stable.

Record: Standalone commit, no prerequisites needed.

**Step 3.4: CHECK AUTHOR**
Sean Wang is a MediaTek developer, regular contributor to both Bluetooth
and WiFi subsystems. Multiple recent commits in drivers/bluetooth/.

Record: Author is domain expert from the hardware vendor.

**Step 3.5: CHECK FOR DEPENDENT/PREREQUISITE COMMITS**
This is patch 2/4 in a series, but it is standalone for USB devices. The
other patches add SDIO device ID (1/4), USB VID/PID for third-party
module (3/4), and SDIO support code (4/4). This patch is sufficient for
USB devices matched by the vendor wildcard
`USB_VENDOR_AND_INTERFACE_INFO(0x0e8d, ...)`.

Record: Standalone for USB devices via vendor wildcard matching.

---

## PHASE 4: MAILING LIST AND EXTERNAL RESEARCH

**Step 4.1: FIND THE ORIGINAL PATCH DISCUSSION**
Found via spinics.net. This is [PATCH 2/4] in Sean Wang's MT7902 series
posted 2026-02-19. The series includes:
1. mmc: sdio: add MediaTek MT7902 SDIO device ID
2. Bluetooth: btmtk: add MT7902 MCU support (THIS commit)
3. Bluetooth: btusb: Add new VID/PID 13d3/3579 for MT7902
4. Bluetooth: btmtk: add MT7902 SDIO support

Record: Part of 4-patch series. This specific patch is standalone for
USB via vendor wildcard.

**Step 4.2: CHECK WHO REVIEWED**
Applied by Luiz Augusto von Dentz, the Bluetooth subsystem maintainer.
Sent to linux-bluetooth and linux-mediatek mailing lists.

Record: Applied by subsystem maintainer.

**Step 4.3: SEARCH FOR BUG REPORT**
No specific bug report - this is proactive hardware enablement by the
chip vendor.

**Step 4.4: RELATED PATCHES**
A separate patch from OnlineLearningTutorials also attempted to add
MT7902 USB IDs (with the same case 0x7902 addition). This confirms real
user demand for MT7902 support.

Record: Multiple independent submissions for MT7902 support indicate
real hardware availability.

**Step 4.5: STABLE MAILING LIST**
No specific stable discussion found.

---

## PHASE 5: CODE SEMANTIC ANALYSIS

**Step 5.1: KEY FUNCTIONS**
Modified: `btmtk_usb_setup()` - only a new case label added.

**Step 5.2: TRACE CALLERS**
`btmtk_usb_setup()` <- `btusb_mtk_setup()` <- assigned to `hdev->setup`
for all BTUSB_MEDIATEK devices. Called during device initialization for
every MediaTek Bluetooth USB device.

**Step 5.3-5.4: CALL CHAIN**
USB device probes -> btusb_probe() -> sets hdev->setup = btusb_mtk_setup
-> HCI core calls hdev->setup() -> btusb_mtk_setup() ->
btmtk_usb_setup() -> switch(dev_id). This is a standard device
initialization path, triggered on every device connection.

**Step 5.5: SIMILAR PATTERNS**
The same pattern is used for MT7922, MT7925, MT7961 - all case labels in
the same switch with identical fall-through behavior.

---

## PHASE 6: CROSS-REFERENCING AND STABLE TREE ANALYSIS

**Step 6.1: DOES THE BUGGY CODE EXIST IN STABLE?**
The switch statement and all 79xx case labels exist in the 7.0 tree. The
firmware loading infrastructure is present. Only the 0x7902 case is
missing.

Record: Infrastructure exists in stable. Only the device ID is missing.

**Step 6.2: BACKPORT COMPLICATIONS**
The diff context shows a retry mechanism (`BTMTK_FIRMWARE_DL_RETRY`)
that doesn't exist in the 7.0 tree. The patch will need minor context
adjustment for the btmtk.c hunk. The btmtk.h hunk applies cleanly.

Record: Minor context conflict expected; trivial manual resolution
needed.

**Step 6.3: RELATED FIXES IN STABLE**
No MT7902 support exists in stable at all.

---

## PHASE 7: SUBSYSTEM AND MAINTAINER CONTEXT

**Step 7.1: SUBSYSTEM CRITICALITY**
Bluetooth (drivers/bluetooth/) - IMPORTANT subsystem. Bluetooth is
widely used in laptops, phones, and IoT devices.

**Step 7.2: SUBSYSTEM ACTIVITY**
Actively developed with regular commits. Device ID additions are a
common pattern.

---

## PHASE 8: IMPACT AND RISK ASSESSMENT

**Step 8.1: WHO IS AFFECTED**
Users with MT7902 Bluetooth hardware. The MT7902 is a MediaTek wireless
chip used in laptops and embedded devices. It appears to be a
recent/current-generation chip.

**Step 8.2: TRIGGER CONDITIONS**
Triggered when a user has MT7902 hardware and the device is enumerated
via USB. The vendor wildcard USB entry matches MediaTek devices, so the
driver binds but fails at firmware loading without this patch.

**Step 8.3: FAILURE MODE**
Without this patch: `bt_dev_err(hdev, "Unsupported hardware variant
(%08x)")` and return -ENODEV. Bluetooth is completely non-functional for
these devices.

Record: Severity: MEDIUM-HIGH (complete loss of Bluetooth functionality
for affected hardware).

**Step 8.4: RISK-BENEFIT RATIO**
- BENEFIT: Enables Bluetooth for MT7902 hardware users on stable
  kernels. High benefit.
- RISK: 2 lines, falls through to well-tested existing code path.
  Essentially zero risk.

Record: Very high benefit-to-risk ratio.

---

## PHASE 9: FINAL SYNTHESIS

**Step 9.1: EVIDENCE**

FOR backporting:
- Classic device ID addition to existing driver (explicit exception
  category)
- Only 2 lines changed
- Falls through to well-tested code path (same as MT7922/MT7925/MT7961)
- Author is the chip vendor's engineer
- Applied by Bluetooth subsystem maintainer
- Enables real hardware that would otherwise be completely non-
  functional
- Multiple independent submissions for MT7902 confirm real user demand
- Zero regression risk (new case label, previously unreachable code
  path)

AGAINST backporting:
- Part of a 4-patch series (but this patch is standalone for USB via
  vendor wildcard)
- Minor context conflict may need resolution for btmtk.c hunk
- No Fixes: tag or Cc: stable (expected for device ID additions)

**Step 9.2: STABLE RULES CHECKLIST**
1. Obviously correct and tested? YES - trivial fall-through case label
2. Fixes a real bug? YES - enables hardware that returns -ENODEV
3. Important issue? YES - complete Bluetooth non-functionality for
   affected hardware
4. Small and contained? YES - 2 lines across 2 files
5. No new features or APIs? YES - just enables existing infrastructure
   for new chip ID
6. Can apply to stable? YES - with minor context adjustment

**Step 9.3: EXCEPTION CATEGORY**
YES - Device ID addition to existing driver. This is an explicitly
allowed exception.

---

## Verification

- [Phase 1] Parsed tags: Sean Wang (MediaTek) authored, Luiz von Dentz
  (BT maintainer) applied
- [Phase 2] Diff analysis: +1 case label in switch, +1 firmware define.
  2 lines total.
- [Phase 3] git blame: switch infrastructure from 5c5e8c52e3cafa
  (2024-07-04), present in stable
- [Phase 3] git log: confirmed 0x7902 does not exist anywhere in the
  current tree
- [Phase 4] Found original submission on spinics.net: [PATCH 2/4] in
  MT7902 series dated 2026-02-19
- [Phase 4] Multiple submissions: both Sean Wang (MediaTek) and an
  independent user submitted MT7902 support
- [Phase 5] Traced call chain: USB probe -> btusb_probe ->
  btusb_mtk_setup -> btmtk_usb_setup -> switch(dev_id)
- [Phase 5] Verified `btmtk_fw_get_filename()` handles 0x7902 via
  existing `else` branch
- [Phase 5] Verified vendor wildcard USB entry
  `USB_VENDOR_AND_INTERFACE_INFO(0x0e8d, ...)` matches MT7902
- [Phase 6] Confirmed switch statement and firmware loading
  infrastructure exist in 7.0 tree
- [Phase 6] Identified minor context conflict: retry mechanism in diff
  not present in stable
- [Phase 8] Without patch: "Unsupported hardware variant" error, -ENODEV
  returned, Bluetooth non-functional

This is a textbook device ID addition - minimal, obviously correct,
enabling real hardware for real users on stable kernels, with
essentially zero regression risk.

**YES**

 drivers/bluetooth/btmtk.c | 1 +
 drivers/bluetooth/btmtk.h | 1 +
 2 files changed, 2 insertions(+)

diff --git a/drivers/bluetooth/btmtk.c b/drivers/bluetooth/btmtk.c
index 0ada5a12130dc..c56c9e077040d 100644
--- a/drivers/bluetooth/btmtk.c
+++ b/drivers/bluetooth/btmtk.c
@@ -1332,6 +1332,7 @@ int btmtk_usb_setup(struct hci_dev *hdev)
 	case 0x7922:
 	case 0x7925:
 	case 0x7961:
+	case 0x7902:
 		btmtk_fw_get_filename(fw_bin_name, sizeof(fw_bin_name), dev_id,
 				      fw_version, fw_flavor);
 
diff --git a/drivers/bluetooth/btmtk.h b/drivers/bluetooth/btmtk.h
index 5df7c32966247..b4506186b2f70 100644
--- a/drivers/bluetooth/btmtk.h
+++ b/drivers/bluetooth/btmtk.h
@@ -5,6 +5,7 @@
 #define FIRMWARE_MT7663		"mediatek/mt7663pr2h.bin"
 #define FIRMWARE_MT7668		"mediatek/mt7668pr2h.bin"
 #define FIRMWARE_MT7922		"mediatek/BT_RAM_CODE_MT7922_1_1_hdr.bin"
+#define FIRMWARE_MT7902		"mediatek/BT_RAM_CODE_MT7902_1_1_hdr.bin"
 #define FIRMWARE_MT7961		"mediatek/BT_RAM_CODE_MT7961_1_2_hdr.bin"
 #define FIRMWARE_MT7925		"mediatek/mt7925/BT_RAM_CODE_MT7925_1_1_hdr.bin"
 
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH AUTOSEL 7.0-6.18] wifi: mt76: flush pending TX before channel switch
       [not found] <20260420132314.1023554-1-sashal@kernel.org>
                   ` (2 preceding siblings ...)
  2026-04-20 13:18 ` [PATCH AUTOSEL 7.0-6.12] Bluetooth: btmtk: add MT7902 MCU support Sasha Levin
@ 2026-04-20 13:18 ` Sasha Levin
  2026-04-20 13:18 ` [PATCH AUTOSEL 7.0-6.6] wifi: mt76: fix list corruption in mt76_wcid_cleanup Sasha Levin
                   ` (12 subsequent siblings)
  16 siblings, 0 replies; 17+ messages in thread
From: Sasha Levin @ 2026-04-20 13:18 UTC (permalink / raw)
  To: patches, stable
  Cc: Felix Fietkau, Sasha Levin, lorenzo, ryder.lee, matthias.bgg,
	angelogioacchino.delregno, linux-wireless, linux-kernel,
	linux-arm-kernel, linux-mediatek

From: Felix Fietkau <nbd@nbd.name>

[ Upstream commit 0dcef1cbae27d806cd29c296cc03ad6e8ece771d ]

mt76_tx() queues frames on wcid->tx_pending for async processing by
tx_worker. In __mt76_set_channel(), the worker gets disabled before it
may have run, and the subsequent wait only checks DMA ring queues, not
the software pending list. This means frames like nullfunc PS frames
from mt76_offchannel_notify() may never be transmitted on the correct
channel.

Fix this by running mt76_txq_schedule_pending() synchronously after
disabling the tx_worker but before setting MT76_RESET, which would
otherwise cause mt76_txq_schedule_pending_wcid() to bail out.

Link: https://patch.msgid.link/20260309060730.87840-8-nbd@nbd.name
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

LLM Generated explanations, may be completely bogus:

Good. I now have a complete picture. Let me compile the full analysis.

---

## PHASE 1: COMMIT MESSAGE FORENSICS

### Step 1.1: Subject Line
- **Subsystem**: wifi: mt76 (MediaTek WiFi driver)
- **Action verb**: "flush" — implies fixing missing work that should
  have been done
- **Summary**: Flush pending TX frames before channel switch to ensure
  they're sent on the correct channel

Record: [wifi: mt76] [flush] [Ensure pending TX frames are transmitted
before channel switch]

### Step 1.2: Tags
- **Link**:
  `https://patch.msgid.link/20260309060730.87840-8-nbd@nbd.name` — part
  of a series (patch 8)
- **Signed-off-by**: Felix Fietkau `<nbd@nbd.name>` — the mt76
  maintainer and original author

Record: No Fixes: tag (expected for autosel candidates). No Reported-by.
No Cc: stable. Author is subsystem maintainer. Part number "8" in msgid
suggests this is part of a series.

### Step 1.3: Commit Body Analysis
The commit clearly explains the bug:
- `mt76_tx()` queues frames on `wcid->tx_pending` for async processing
  by `tx_worker`
- In `__mt76_set_channel()`, the worker gets disabled *after*
  `MT76_RESET` is set
- `mt76_txq_schedule_pending_wcid()` bails out when `MT76_RESET` is set
  (line 626 of tx.c)
- The `wait_event_timeout` only checks DMA ring queues via
  `mt76_has_tx_pending()`, NOT the software pending list
- Result: Frames like nullfunc PS frames may never be transmitted on the
  correct channel

Record: Bug = TX frames lost during channel switch due to ordering issue
between MT76_RESET flag and tx_worker disable. Symptom = nullfunc power-
save frames not transmitted. Root cause = MT76_RESET set before
schedule_pending runs, causing bail-out.

### Step 1.4: Hidden Bug Fix Detection
This is an explicit bug fix, not disguised. The commit clearly describes
lost TX frames.

Record: Explicit bug fix — not a hidden fix.

---

## PHASE 2: DIFF ANALYSIS

### Step 2.1: Inventory
- **mac80211.c**: 3 lines reordered (moved `mt76_worker_disable` before
  `set_bit(MT76_RESET)`, added `mt76_txq_schedule_pending()` call)
- **mt76.h**: 1 line added (function declaration)
- **tx.c**: 1 line changed (`static void` → `void`)
- **Total**: ~5 lines of meaningful change
- **Functions modified**: `__mt76_set_channel()`,
  `mt76_txq_schedule_pending()` (visibility only)

Record: [3 files, ~5 lines changed] [__mt76_set_channel reordering,
mt76_txq_schedule_pending visibility] [Single-subsystem surgical fix]

### Step 2.2: Code Flow Change

**Before** (`__mt76_set_channel()`):
1. `set_bit(MT76_RESET, &phy->state)` — blocks
   `mt76_txq_schedule_pending_wcid`
2. `mt76_worker_disable(&dev->tx_worker)` — stops worker
3. `wait_event_timeout(... !mt76_has_tx_pending ...)` — only checks DMA
   queues

**After**:
1. `mt76_worker_disable(&dev->tx_worker)` — stops worker first
2. `mt76_txq_schedule_pending(phy)` — synchronously flush software
   pending list to DMA
3. `set_bit(MT76_RESET, &phy->state)` — now safe to set
4. `wait_event_timeout(... !mt76_has_tx_pending ...)` — DMA queues now
   include flushed frames

Record: Fix reorders operations so pending frames get flushed to DMA
rings before MT76_RESET blocks further processing.

### Step 2.3: Bug Mechanism
This is a **logic/ordering bug** leading to **TX frame loss**. The
`MT76_RESET` flag acts as a gate in `mt76_txq_schedule_pending_wcid()`
(line 626), and it was being set too early, preventing software-queued
frames from ever reaching the hardware.

Record: [Logic/ordering bug] [MT76_RESET set too early prevents software
TX queue flushing → frame loss]

### Step 2.4: Fix Quality
- Obviously correct: The reordering is logically sound — disable worker,
  flush pending, then set reset flag
- Minimal and surgical: ~5 lines changed
- Regression risk: Very low — the only new code path is calling
  `mt76_txq_schedule_pending()` synchronously, which already runs as
  part of `mt76_txq_schedule_all()` via the tx_worker. The worker is
  already disabled at this point, so no concurrency concern.

Record: [High quality fix, obviously correct, minimal scope, very low
regression risk]

---

## PHASE 3: GIT HISTORY INVESTIGATION

### Step 3.1: Blame
The buggy code pattern (`set_bit` before `mt76_worker_disable`) was
introduced by:
- `f4fdd7716290a2` ("wifi: mt76: partially move channel change code to
  core") — v6.12
- `0b3be9d1d34e21` ("wifi: mt76: add separate tx scheduling queue for
  off-channel tx") — v6.12

These two commits together created the bug: one added the off-channel TX
pending mechanism, the other moved channel change code to core with the
wrong ordering.

Record: Bug introduced in v6.12 by commits f4fdd7716290a2 and
0b3be9d1d34e21.

### Step 3.2: Fixes Tag
No Fixes: tag present (expected for autosel). However, the bug clearly
traces back to `0b3be9d1d34e` — two related fixes (228bc0e79c852 and
49fba87205bec) already reference it with `Fixes:` tags.

Record: Related fixes 228bc0e79c852 and 49fba87205bec both fix
0b3be9d1d34e — this is a third fix for the same problematic commit.

### Step 3.3: File History
Related recent commits:
- `228bc0e79c852` (v6.14): "only enable tx worker after setting the
  channel" — Fixes: 0b3be9d1d34e
- `49fba87205bec`: "fix linked list corruption" — Fixes: 0b3be9d1d34e
- `bdeac7815629c`: "free pending offchannel tx frames on wcid cleanup"

Record: Multiple follow-up fixes to the same offchannel TX code. This
commit is standalone — only needs the pre-existing
mt76_txq_schedule_pending function.

### Step 3.4: Author
Felix Fietkau (`nbd@nbd.name`) is the mt76 subsystem maintainer and
original author of the driver. Very high confidence in fix correctness.

Record: Author is the mt76 maintainer — highest trust level.

### Step 3.5: Dependencies
- `mt76_txq_schedule_pending()` exists since v6.12 (commit 0b3be9d1d34e)
- `__mt76_set_channel()` exists since v6.14 (commit 82334623af0cd2)
- For v6.12 backport: function is called `mt76_set_channel()` with
  different context — needs adaptation
- For v6.14+/7.0: should apply cleanly or with minimal context
  adjustment
- Commit `228bc0e79c852` (v6.14) should ideally be present first, as it
  repositions `mt76_worker_enable()`. The v6.12 code has enable before
  `set_channel`, which was moved by that fix.

Record: Dependencies on 0b3be9d1d34e (present since v6.12) and
82334623af0cd2 (v6.14). For v6.12 backport, adaptation is needed.

---

## PHASE 4: MAILING LIST AND EXTERNAL RESEARCH

### Step 4.1: Patch Discussion
b4 dig could not find the exact commit by the Link msgid. The msgid
suggests this is patch 8 of a series from Felix Fietkau dated
2026-03-09.

Record: Part of a larger series. Could not retrieve full lore discussion
due to lore anti-bot protections.

### Step 4.2: Reviewers
Felix Fietkau is both author and maintainer — self-reviewed. This is
normal for mt76 where he is the primary maintainer.

Record: Author is subsystem maintainer.

### Step 4.3-4.5: Bug Report
No Reported-by tag, no syzbot report. This appears to be found through
code review by the maintainer. No stable-specific discussion found.

Record: Found by maintainer code review.

---

## PHASE 5: CODE SEMANTIC ANALYSIS

### Step 5.1-5.2: Key Functions
- `__mt76_set_channel()` — called from `mt76_set_channel()`
  (mac80211.c:1071) and `mt76_chanctx_assign_vif()` (channel.c:29)
- `mt76_set_channel()` is the mac80211 callback for channel changes
- This is called during every channel switch, scan, remain-on-channel —
  a **common operation**

### Step 5.3-5.4: Call Chain
`mac80211 → mt76_set_channel() → __mt76_set_channel()` — this is the
standard channel switch path, reachable during normal WiFi operation
(scanning, roaming, channel changes).

Record: Commonly triggered during WiFi scanning and channel switching.

### Step 5.5: Similar Patterns
The MT76_RESET bail-out pattern in `mt76_txq_schedule_pending_wcid()`
(line 626) is the direct cause. The same flag check exists in other TX
scheduling paths (lines 492, 546) which are also affected by the
ordering.

Record: MT76_RESET acts as gating mechanism in multiple TX paths.

---

## PHASE 6: STABLE TREE ANALYSIS

### Step 6.1: Bug Existence in Stable
- **v6.6 and earlier**: Bug does NOT exist —
  `mt76_txq_schedule_pending()` and the offchannel TX mechanism weren't
  added until v6.12
- **v6.12**: Bug EXISTS — has both the offchannel TX pending mechanism
  and the wrong ordering in `mt76_set_channel()`
- **v6.14+**: Bug EXISTS — has `__mt76_set_channel()` with the wrong
  ordering

Record: Bug exists in v6.12+ stable trees.

### Step 6.2: Backport Complications
- **v7.0**: Should apply cleanly
- **v6.14**: Should apply cleanly or near-cleanly (function name same)
- **v6.12**: Needs adaptation — different function name
  (`mt76_set_channel` vs `__mt76_set_channel`), different surrounding
  code (mutex_lock, cancel_delayed_work), may also need 228bc0e79c852 as
  prerequisite

Record: Clean for v6.14+; needs rework for v6.12.

---

## PHASE 7: SUBSYSTEM CONTEXT

### Step 7.1: Subsystem Criticality
WiFi driver (mt76) — **IMPORTANT**. MediaTek MT76xx chipsets are
extremely common in consumer routers, laptops (mt7921/mt7922), and
access points (mt7915, mt7996). This is one of the most widely used WiFi
driver families in Linux.

Record: [drivers/net/wireless/mediatek/mt76] [IMPORTANT — very common
WiFi hardware]

### Step 7.2: Activity
Very active subsystem with frequent fixes from the maintainer.

---

## PHASE 8: IMPACT AND RISK ASSESSMENT

### Step 8.1: Affected Users
All users of MT76-based WiFi devices (mt7603, mt76x0, mt76x2, mt7615,
mt7915, mt7921/mt7922, mt7996) — a very large user base including laptop
users, router users, and embedded systems.

Record: [Driver-specific but very widely used]

### Step 8.2: Trigger Conditions
- Triggered during any channel switch (scanning, roaming, offchannel
  operations)
- WiFi scanning happens automatically and frequently
- The bug window is when frames are queued on `tx_pending` just before a
  channel switch

Record: [Common trigger — scanning happens regularly on all WiFi
devices]

### Step 8.3: Failure Mode
- TX frame loss — nullfunc PS frames not transmitted → AP may not know
  client is going off-channel → potential packet loss, connectivity
  issues
- Not a crash, but a functional correctness issue affecting WiFi
  reliability

Record: [Failure mode: TX frame loss during channel switch] [Severity:
MEDIUM-HIGH — affects WiFi reliability]

### Step 8.4: Risk-Benefit
- **Benefit**: Fixes TX frame loss during channel switch on widely-used
  WiFi hardware. Medium-high benefit.
- **Risk**: Very low — 5 lines of reordering, obviously correct,
  authored by maintainer
- **Ratio**: Strong benefit, minimal risk

Record: [Benefit: medium-high] [Risk: very low] [Favorable ratio]

---

## PHASE 9: FINAL SYNTHESIS

### Step 9.1: Evidence

**FOR backporting**:
- Fixes a real bug: TX frames lost during channel switch
- Small, surgical fix: ~5 lines of meaningful change
- Authored by the mt76 maintainer (Felix Fietkau)
- Affects widely-used WiFi hardware
- Common trigger (WiFi scanning)
- Obviously correct: just reordering operations and adding a synchronous
  flush
- Very low regression risk
- Related to same code that already has two other Fixes: tagged commits

**AGAINST backporting**:
- No Fixes: tag (expected for autosel)
- No Reported-by (found by maintainer review)
- Part of a larger series (patch 8) — but the fix is self-contained
- v6.12 backport would need adaptation
- Not a crash — "just" frame loss (but impacts WiFi reliability)

### Step 9.2: Stable Rules Checklist
1. Obviously correct? **YES** — ordering fix is straightforward
2. Fixes a real bug? **YES** — TX frame loss during channel switch
3. Important issue? **YES** — WiFi reliability on common hardware
4. Small and contained? **YES** — ~5 lines across 3 files in same
   subsystem
5. No new features? **YES** — just fixes ordering
6. Can apply to stable? **YES** for v6.14+; **needs rework** for v6.12

### Step 9.3: Exception Categories
Not an exception category — this is a straightforward bug fix.

### Step 9.4: Decision
The fix addresses a real TX frame loss bug during WiFi channel switching
on widely-used MT76 hardware. It is small, obviously correct, authored
by the subsystem maintainer, and carries minimal regression risk. The
bug affects WiFi reliability for a large user base.

---

## Verification

- [Phase 1] Parsed tags: Link to patch.msgid.link, Signed-off-by Felix
  Fietkau (mt76 maintainer)
- [Phase 2] Diff analysis: Reorders
  set_bit(MT76_RESET)/mt76_worker_disable, adds
  mt76_txq_schedule_pending() call, makes function non-static
- [Phase 2] Confirmed MT76_RESET bail-out at tx.c line 626:
  `test_bit(MT76_RESET, &phy->state)` causes schedule_pending_wcid to
  return -1
- [Phase 2] Confirmed mt76_has_tx_pending() only checks DMA queues
  (q->queued), not software pending list
- [Phase 3] git blame: `set_bit(MT76_RESET)` ordering from
  f4fdd7716290a2 (v6.12), offchannel pending from 0b3be9d1d34e21 (v6.12)
- [Phase 3] git show 228bc0e79c852: Related fix "only enable tx worker
  after setting the channel", Fixes: 0b3be9d1d34e (v6.14)
- [Phase 3] git show 49fba87205bec: Related fix "fix linked list
  corruption", Fixes: 0b3be9d1d34e
- [Phase 3] Confirmed __mt76_set_channel() introduced in 82334623af0cd2
  (v6.14)
- [Phase 3] Felix Fietkau confirmed as mt76 maintainer via git log
  --author
- [Phase 5] mt76_txq_schedule_pending called from mt76_txq_schedule_all
  → mt76_tx_worker_run → tx_worker — normal TX path
- [Phase 6] Confirmed mt76_txq_schedule_pending does NOT exist in v6.6
  (bug not present pre-v6.12)
- [Phase 6] Confirmed v6.12 has the same bug pattern (set_bit before
  worker_disable, no schedule_pending call)
- [Phase 6] v6.12 uses `mt76_set_channel()` not `__mt76_set_channel()` —
  backport needs adaptation
- UNVERIFIED: Could not access lore.kernel.org discussion thread due to
  anti-bot protections

**YES**

 drivers/net/wireless/mediatek/mt76/mac80211.c | 5 +++--
 drivers/net/wireless/mediatek/mt76/mt76.h     | 1 +
 drivers/net/wireless/mediatek/mt76/tx.c       | 2 +-
 3 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/net/wireless/mediatek/mt76/mac80211.c b/drivers/net/wireless/mediatek/mt76/mac80211.c
index d0c522909e980..73d252e0a7bf3 100644
--- a/drivers/net/wireless/mediatek/mt76/mac80211.c
+++ b/drivers/net/wireless/mediatek/mt76/mac80211.c
@@ -1030,9 +1030,10 @@ int __mt76_set_channel(struct mt76_phy *phy, struct cfg80211_chan_def *chandef,
 	int timeout = HZ / 5;
 	int ret;
 
-	set_bit(MT76_RESET, &phy->state);
-
 	mt76_worker_disable(&dev->tx_worker);
+	mt76_txq_schedule_pending(phy);
+
+	set_bit(MT76_RESET, &phy->state);
 	wait_event_timeout(dev->tx_wait, !mt76_has_tx_pending(phy), timeout);
 	mt76_update_survey(phy);
 
diff --git a/drivers/net/wireless/mediatek/mt76/mt76.h b/drivers/net/wireless/mediatek/mt76/mt76.h
index d05e83ea1cacc..7bba0831bc0eb 100644
--- a/drivers/net/wireless/mediatek/mt76/mt76.h
+++ b/drivers/net/wireless/mediatek/mt76/mt76.h
@@ -1518,6 +1518,7 @@ void mt76_stop_tx_queues(struct mt76_phy *phy, struct ieee80211_sta *sta,
 void mt76_tx_check_agg_ssn(struct ieee80211_sta *sta, struct sk_buff *skb);
 void mt76_txq_schedule(struct mt76_phy *phy, enum mt76_txq_id qid);
 void mt76_txq_schedule_all(struct mt76_phy *phy);
+void mt76_txq_schedule_pending(struct mt76_phy *phy);
 void mt76_tx_worker_run(struct mt76_dev *dev);
 void mt76_tx_worker(struct mt76_worker *w);
 void mt76_release_buffered_frames(struct ieee80211_hw *hw,
diff --git a/drivers/net/wireless/mediatek/mt76/tx.c b/drivers/net/wireless/mediatek/mt76/tx.c
index 0753acf2eccb8..ab62591b7a260 100644
--- a/drivers/net/wireless/mediatek/mt76/tx.c
+++ b/drivers/net/wireless/mediatek/mt76/tx.c
@@ -660,7 +660,7 @@ mt76_txq_schedule_pending_wcid(struct mt76_phy *phy, struct mt76_wcid *wcid,
 	return ret;
 }
 
-static void mt76_txq_schedule_pending(struct mt76_phy *phy)
+void mt76_txq_schedule_pending(struct mt76_phy *phy)
 {
 	LIST_HEAD(tx_list);
 	int ret = 0;
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH AUTOSEL 7.0-6.6] wifi: mt76: fix list corruption in mt76_wcid_cleanup
       [not found] <20260420132314.1023554-1-sashal@kernel.org>
                   ` (3 preceding siblings ...)
  2026-04-20 13:18 ` [PATCH AUTOSEL 7.0-6.18] wifi: mt76: flush pending TX before channel switch Sasha Levin
@ 2026-04-20 13:18 ` Sasha Levin
  2026-04-20 13:18 ` [PATCH AUTOSEL 7.0-6.12] wifi: mt76: add missing lock protection in mt76_sta_state for sta_event callback Sasha Levin
                   ` (11 subsequent siblings)
  16 siblings, 0 replies; 17+ messages in thread
From: Sasha Levin @ 2026-04-20 13:18 UTC (permalink / raw)
  To: patches, stable
  Cc: Zac Bowling, Felix Fietkau, Sasha Levin, lorenzo, ryder.lee,
	matthias.bgg, angelogioacchino.delregno, linux-wireless,
	linux-kernel, linux-arm-kernel, linux-mediatek

From: Zac Bowling <zac@zacbowling.com>

[ Upstream commit 34163942195410372fb138bea806c9b34e2f5257 ]

mt76_wcid_cleanup() was not removing wcid entries from sta_poll_list
before mt76_reset_device() reinitializes the master list. This leaves
stale pointers in wcid->poll_list, causing list corruption when
mt76_wcid_add_poll() later checks list_empty() and tries to add the
entry back.

The fix adds proper cleanup of poll_list in mt76_wcid_cleanup(),
matching how tx_list is already handled. This is similar to what
mt7996_mac_sta_deinit_link() already does correctly.

Fixes list corruption warnings like:
  list_add corruption. prev->next should be next (ffffffff...)

Signed-off-by: Zac Bowling <zac@zacbowling.com>
Link: https://patch.msgid.link/20260120201043.38225-3-zac@zacbowling.com
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

LLM Generated explanations, may be completely bogus:

I now have all the information needed for the full analysis. Let me
compile the findings.

---

## PHASE 1: COMMIT MESSAGE FORENSICS

### Step 1.1: Subject Line
- **Subsystem**: wifi: mt76 (MediaTek WiFi driver)
- **Action verb**: "fix" - explicitly a bug fix
- **Summary**: Fix list corruption in `mt76_wcid_cleanup()`

Record: [wifi: mt76] [fix] [list corruption in mt76_wcid_cleanup causing
stale pointers after reset]

### Step 1.2: Tags
- **Signed-off-by**: Zac Bowling <zac@zacbowling.com> (author)
- **Link**:
  https://patch.msgid.link/20260120201043.38225-3-zac@zacbowling.com
  (original submission)
- **Signed-off-by**: Felix Fietkau <nbd@nbd.name> (mt76 subsystem
  maintainer - applied the patch)
- No Fixes: tag (expected for manual review candidates)
- No Cc: stable tag (expected)

Record: Patch was applied by subsystem maintainer Felix Fietkau, who is
the author of the surrounding code. This is a strong signal the fix is
correct.

### Step 1.3: Commit Body Analysis
The commit clearly explains the bug:
1. `mt76_wcid_cleanup()` does not remove wcid entries from
   `sta_poll_list`
2. `mt76_reset_device()` reinitializes the master `sta_poll_list` with
   `INIT_LIST_HEAD`
3. This leaves `wcid->poll_list` with stale prev/next pointers
4. When `mt76_wcid_add_poll()` later checks `list_empty()` and does
   `list_add_tail()`, list corruption occurs

**Symptom**: `list_add corruption. prev->next should be next
(ffffffff...)` - a kernel WARNING/BUG

Record: Clear list corruption bug during hardware restart. The failure
mode is a kernel list corruption warning, which indicates corrupted
linked list pointers. This can lead to crashes or undefined behavior.

### Step 1.4: Hidden Bug Fix Detection
This is NOT a hidden fix - it explicitly says "fix list corruption" and
describes the exact mechanism.

---

## PHASE 2: DIFF ANALYSIS

### Step 2.1: Inventory
- **Files changed**: 1 (`drivers/net/wireless/mediatek/mt76/mac80211.c`)
- **Lines added**: ~7 (5 lines of code + 4 lines of comment)
- **Lines removed**: 0
- **Functions modified**: `mt76_wcid_cleanup()`
- **Scope**: Single-file, single-function, surgical fix

Record: Very small, contained change. +10 lines (including comments),
single function.

### Step 2.2: Code Flow Change
**Before**: `mt76_wcid_cleanup()` cleaned up `tx_list`, `tx_pending`,
`tx_offchannel`, and `pktid` but NOT `poll_list`.

**After**: `mt76_wcid_cleanup()` also removes the wcid from
`sta_poll_list` using the proper `spin_lock_bh(&dev->sta_poll_lock)` /
`list_del_init()` pattern, matching how `tx_list` is handled (lines
1721-1722).

### Step 2.3: Bug Mechanism
This is a **list corruption / stale pointer bug**:
1. `mt76_reset_device()` calls `mt76_wcid_cleanup()` for each wcid (line
   848)
2. After the loop, it does `INIT_LIST_HEAD(&dev->sta_poll_list)` (line
   854) - reinitializes the list head
3. Any wcid still linked to `sta_poll_list` now has stale prev/next
   pointers
4. Later `mt76_wcid_add_poll()` (line 1747) checks `list_empty()` on the
   stale entry, gets a bogus result, and triggers list corruption when
   trying to add

The fix adds the missing cleanup. This matches the established pattern -
every other caller of `mt76_wcid_cleanup()` (mt7996, mt7915, mt792x,
mt7615, mt7603) removes the wcid from poll_list BEFORE calling
`mt76_wcid_cleanup()`. Only the `mt76_reset_device()` path was missing
this.

### Step 2.4: Fix Quality
- **Obviously correct**: Yes. It adds `list_del_init()` under the same
  lock, matching the exact pattern used by ALL individual driver callers
  and matching how `tx_list` is already handled in the same function.
- **Minimal**: Yes. 5 lines of code, 4 lines of comment.
- **Regression risk**: Very low. Adding a properly locked
  `list_del_init()` is safe. The `list_empty()` check prevents double-
  delete. The init ensures the poll_list is in a clean state.

---

## PHASE 3: GIT HISTORY INVESTIGATION

### Step 3.1: Blame
- `mt76_wcid_cleanup()` was introduced by commit `0335c034e7265d` (Felix
  Fietkau, 2023-08-29)
- `poll_list` initialization in `mt76_wcid_init` was added by
  `cbf5e61da66028` (Felix Fietkau, 2025-01-02)
- `mt76_wcid_add_poll()` was added by `387ab042ace87` (Felix Fietkau,
  2024-12-30, in v6.14)
- `mt76_reset_device()` was added by `065c79df595af` (Felix Fietkau,
  2025-08-27, in v6.17)

The bug was introduced when `065c79df595af` added `mt76_reset_device()`
which calls `mt76_wcid_cleanup()` then reinitializes `sta_poll_list`
without first removing entries.

### Step 3.2: Fixes Tag
No Fixes: tag. Based on analysis, should reference `065c79df595af`
("wifi: mt76: mt7915: fix list corruption after hardware restart") which
introduced `mt76_reset_device()`.

### Step 3.3: Related Changes
- `065c79df595af` - mt7915 list corruption fix (introduced
  mt76_reset_device, paradoxically introducing THIS bug)
- `a3c99ef88a084` - do not add non-sta wcid entries to the poll list
- `ace5d3b6b49e8` - mt7996 hardware restart reliability (uses
  mt76_reset_device)
- `328e35c7bfc67` - mt7915 hardware restart reliability

### Step 3.4: Author
Zac Bowling is not a regular mt76 contributor (only 1 commit found).
However, the patch was accepted and signed by Felix Fietkau
(nbd@nbd.name), who is the mt76 subsystem maintainer and authored ALL
the surrounding code.

### Step 3.5: Dependencies
The fix is standalone. It only uses `dev->sta_poll_lock`,
`wcid->poll_list`, `list_empty()`, `list_del_init()`, and
`spin_lock_bh()/spin_unlock_bh()` - all of which exist in any kernel
that has `mt76_reset_device()` (v6.17+).

---

## PHASE 4: MAILING LIST RESEARCH

### Step 4.1-4.5
Lore was behind Anubis anti-bot protection and could not be directly
fetched. However, the commit has a Link: to
`patch.msgid.link/20260120201043.38225-3-zac@zacbowling.com`, and b4 dig
confirmed the related series context. The patch was applied by the
subsystem maintainer (Felix Fietkau), which is the strongest possible
endorsement for mt76 patches.

---

## PHASE 5: CODE SEMANTIC ANALYSIS

### Step 5.1: Functions Modified
- `mt76_wcid_cleanup()` - the only function modified

### Step 5.2: Callers
`mt76_wcid_cleanup()` is called from:
1. `mt76_reset_device()` (mac80211.c:848) - the buggy path
2. `mt76_unregister_device()` (mac80211.c:807) - for global wcid
3. `mt76_sta_pre_rcu_remove()` (mac80211.c:1617) - normal station
   removal
4. Individual drivers: mt7996, mt7915, mt7925, mt792x, mt7615, mt7603 -
   in their sta_remove/bss_remove handlers

All the individual driver callers (items 4) already remove `poll_list`
BEFORE calling `mt76_wcid_cleanup()`. Only the `mt76_reset_device()`
path (item 1) was missing this cleanup.

### Step 5.3-5.5: Call Chain and Impact
`mt76_reset_device()` is called from:
- `mt7915_mac_full_reset()` - hardware restart path
- `mt7996` hardware restart path

This is triggered during hardware error recovery - a real, non-rare
event for WiFi users experiencing firmware crashes.

---

## PHASE 6: STABLE TREE ANALYSIS

### Step 6.1: Buggy Code in Stable Trees
- `mt76_reset_device()` (the bug source) was introduced in
  `065c79df595af`, first in v6.17
- `mt76_wcid_add_poll()` (needed for the bug to manifest) in v6.14
- **Bug exists in**: v6.17, v6.18, v6.19, v7.0
- The surrounding code (`bdeac7815629c` offchannel cleanup) is also in
  v6.17+ so the context should match

### Step 6.2: Backport Complications
The fix should apply cleanly to v6.17+. The diff context lines
(idr_destroy, tx_list cleanup) have been stable since 2023.

---

## PHASE 7: SUBSYSTEM CONTEXT

### Step 7.1: Subsystem
- **Subsystem**: WiFi driver (drivers/net/wireless/mediatek/mt76)
- **Criticality**: IMPORTANT - mt76 is one of the most popular open-
  source WiFi drivers, used in many routers (OpenWrt), embedded systems,
  and Linux laptops
- MediaTek WiFi chipsets (mt7915, mt7996, mt7921/mt7922) are extremely
  common

### Step 7.2: Activity Level
Very active - many commits in the v6.17-v7.0 window, actively maintained
by Felix Fietkau.

---

## PHASE 8: IMPACT AND RISK ASSESSMENT

### Step 8.1: Affected Users
All users of mt76-based WiFi hardware that experience hardware restarts
(firmware crashes). This is a broad category including:
- OpenWrt routers with mt7915/mt7996 chipsets
- Laptops with mt7921/mt7922 WiFi
- Any system using MediaTek WiFi that encounters a hardware error
  triggering restart

### Step 8.2: Trigger Conditions
Triggered during hardware restart/reset recovery - specifically when
`mt76_reset_device()` is called and then `mt76_wcid_add_poll()` is
called for a wcid that was on the old poll_list. Not extremely rare for
WiFi users since firmware crashes do happen.

### Step 8.3: Failure Mode Severity
- **List corruption** triggers kernel `BUG()`/`WARN()` from list debug
  code
- Without list debug: silent corruption can lead to kernel crashes, use-
  after-free, or infinite loops
- **Severity: CRITICAL** (kernel crash or corruption during WiFi
  recovery)

### Step 8.4: Risk-Benefit Ratio
- **Benefit**: HIGH - prevents kernel crashes during WiFi hardware
  restart, affects common hardware
- **Risk**: VERY LOW - 5 lines of code, uses established patterns,
  accepted by subsystem maintainer
- **Ratio**: Strongly favorable

---

## PHASE 9: FINAL SYNTHESIS

### Step 9.1: Evidence Summary

**FOR backporting:**
- Fixes a real, demonstrable list corruption bug that causes kernel
  crashes
- Small, surgical fix (5 lines of code + comments)
- Follows the exact established pattern used by ALL other callers
- Applied by the subsystem maintainer (Felix Fietkau) who wrote the
  surrounding code
- Affects widely-used WiFi hardware (MediaTek mt76 chipsets)
- Bug triggers during hardware restart - a real-world scenario

**AGAINST backporting:**
- No Fixes: tag (but the fix implicitly targets `065c79df595af`)
- Author is not a regular contributor (but maintainer accepted it)
- Bug only exists in v6.17+ (limited to recent stable trees)

### Step 9.2: Stable Rules Checklist
1. **Obviously correct and tested?** YES - follows existing pattern,
   accepted by maintainer
2. **Fixes a real bug?** YES - list corruption during hardware restart
3. **Important issue?** YES - kernel BUG/crash severity
4. **Small and contained?** YES - 5 lines of code, single function
5. **No new features or APIs?** Correct - pure bug fix
6. **Can apply to stable?** YES - should apply cleanly to v6.17+

### Step 9.3: Exception Categories
Not applicable - this is a straightforward bug fix, not an exception
category.

### Step 9.4: Decision
Strong YES. This is a textbook stable backport candidate: small,
obviously correct, fixes a real kernel crash (list corruption), affects
popular hardware, accepted by the subsystem maintainer.

---

## Verification

- [Phase 1] Parsed tags: Link to patch.msgid.link, applied by Felix
  Fietkau (mt76 maintainer)
- [Phase 2] Diff analysis: +5 lines of code in `mt76_wcid_cleanup()`,
  adds `list_del_init(&wcid->poll_list)` under `sta_poll_lock`
- [Phase 2] Pattern match: Verified ALL other callers (mt7996, mt7915,
  mt792x, mt7615, mt7603) remove poll_list before calling
  mt76_wcid_cleanup - this fix makes the function self-contained
- [Phase 3] git blame: `mt76_wcid_cleanup()` introduced in
  0335c034e7265d (2023-08-29), `mt76_reset_device()` introduced in
  065c79df595af (2025-08-27)
- [Phase 3] git merge-base: `mt76_reset_device()` (065c79df595af) is in
  v6.17 but NOT v6.16, confirming bug exists in v6.17+
- [Phase 3] Related fix: 065c79df595af introduced `mt76_reset_device()`
  which paradoxically introduced this bug by calling
  `mt76_wcid_cleanup()` without poll_list cleanup then doing
  `INIT_LIST_HEAD(&dev->sta_poll_list)`
- [Phase 4] Lore: blocked by Anubis, but commit was applied by the
  subsystem maintainer which confirms review
- [Phase 5] Callers of `mt76_wcid_cleanup`: 7 call sites found;
  `mt76_reset_device()` is the only one that doesn't remove poll_list
  beforehand
- [Phase 5] `mt76_reset_device()` called from mt7915_mac_full_reset and
  mt7996 restart - real hardware restart paths
- [Phase 6] Bug exists in v6.17, v6.18, v6.19, v7.0 (confirmed with git
  merge-base)
- [Phase 6] Context code (offchannel bdeac7815629c) confirmed in v6.17+,
  so patch should apply cleanly
- [Phase 8] Failure mode: list_add corruption BUG/WARN → kernel crash,
  severity CRITICAL
- UNVERIFIED: Could not access lore discussion due to Anubis protection;
  maintainer sign-off is sufficient evidence of review

**YES**

 drivers/net/wireless/mediatek/mt76/mac80211.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/drivers/net/wireless/mediatek/mt76/mac80211.c b/drivers/net/wireless/mediatek/mt76/mac80211.c
index 75772979f438e..d0c522909e980 100644
--- a/drivers/net/wireless/mediatek/mt76/mac80211.c
+++ b/drivers/net/wireless/mediatek/mt76/mac80211.c
@@ -1716,6 +1716,16 @@ void mt76_wcid_cleanup(struct mt76_dev *dev, struct mt76_wcid *wcid)
 
 	idr_destroy(&wcid->pktid);
 
+	/* Remove from sta_poll_list to prevent list corruption after reset.
+	 * Without this, mt76_reset_device() reinitializes sta_poll_list but
+	 * leaves wcid->poll_list with stale pointers, causing list corruption
+	 * when mt76_wcid_add_poll() checks list_empty().
+	 */
+	spin_lock_bh(&dev->sta_poll_lock);
+	if (!list_empty(&wcid->poll_list))
+		list_del_init(&wcid->poll_list);
+	spin_unlock_bh(&dev->sta_poll_lock);
+
 	spin_lock_bh(&phy->tx_lock);
 
 	if (!list_empty(&wcid->tx_list))
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH AUTOSEL 7.0-6.12] wifi: mt76: add missing lock protection in mt76_sta_state for sta_event callback
       [not found] <20260420132314.1023554-1-sashal@kernel.org>
                   ` (4 preceding siblings ...)
  2026-04-20 13:18 ` [PATCH AUTOSEL 7.0-6.6] wifi: mt76: fix list corruption in mt76_wcid_cleanup Sasha Levin
@ 2026-04-20 13:18 ` Sasha Levin
  2026-04-20 13:18 ` [PATCH AUTOSEL 7.0-6.1] Bluetooth: btmtk: improve mt79xx firmware setup retry flow Sasha Levin
                   ` (10 subsequent siblings)
  16 siblings, 0 replies; 17+ messages in thread
From: Sasha Levin @ 2026-04-20 13:18 UTC (permalink / raw)
  To: patches, stable
  Cc: Ziyi Guo, Felix Fietkau, Sasha Levin, lorenzo, ryder.lee,
	matthias.bgg, angelogioacchino.delregno, linux-wireless,
	linux-kernel, linux-arm-kernel, linux-mediatek

From: Ziyi Guo <n7l8m4@u.northwestern.edu>

[ Upstream commit f0168f2f9a1eca55d3ae09d8250b94e82b67cac3 ]

mt76_sta_state() calls the sta_event callback without holding dev->mutex.
However, mt7915_mac_sta_event() (MT7915 implementation of this callback)
calls mt7915_mac_twt_teardown_flow() which has
lockdep_assert_held(&dev->mt76.mutex) indicating that callers must
hold this lock.

The locking pattern in mt76_sta_state() is inconsistent:
- mt76_sta_add() acquires dev->mutex before calling dev->drv->sta_add
- mt76_sta_remove() acquires dev->mutex before calling __mt76_sta_remove
- But sta_event callback is called without acquiring the lock

Add mutex_lock()/mutex_unlock() around the mt7915_mac_twt_teardown_flow
invocation to fix the missing lock protection and maintain consistency
with the existing locking pattern.

Signed-off-by: Ziyi Guo <n7l8m4@u.northwestern.edu>
Link: https://patch.msgid.link/20260131035210.2198259-1-n7l8m4@u.northwestern.edu
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

LLM Generated explanations, may be completely bogus:

Now let me complete the systematic analysis.

---

## PHASE 1: COMMIT MESSAGE FORENSICS

**Step 1.1: Subject Line**
- Subsystem: `wifi: mt76:` (MediaTek mt76 wireless driver)
- Action verb: "add missing" - indicates fixing a gap in existing code
- Summary: Adding lock protection around `sta_event` callback for TWT
  teardown

Record: [wifi: mt76 (mt7915)] [add missing lock] [Fix missing mutex
around mt7915_mac_twt_teardown_flow call in disassoc path]

**Step 1.2: Tags**
- Signed-off-by: Ziyi Guo (author)
- Link: patch.msgid.link (standard patch submission link)
- Signed-off-by: Felix Fietkau (mt76 maintainer who applied the patch)
- No Fixes: tag (expected for manual review)
- No Cc: stable (expected)
- No Reported-by (found via code analysis, not user report)

Record: Applied by Felix Fietkau (nbd@nbd.name) who is the mt76
subsystem maintainer.

**Step 1.3: Commit Body**
The commit clearly describes:
- Bug: `mt76_sta_state()` calls `sta_event` callback without holding
  `dev->mutex`
- `mt7915_mac_twt_teardown_flow()` has
  `lockdep_assert_held(&dev->mt76.mutex)` - proving the lock is required
- The inconsistency: `mt76_sta_add()` and `mt76_sta_remove()` correctly
  hold the lock, but `sta_event` does not
- Fix: Add `mutex_lock()`/`mutex_unlock()` around the specific call

Record: Bug is missing lock protection leading to lockdep warning and
potential data races in TWT teardown during disassociation. Root cause
is inconsistent locking introduced during refactoring.

**Step 1.4: Hidden Bug Fix?**
This is an explicit bug fix - "add missing lock protection" clearly
describes a synchronization issue. Not hidden.

## PHASE 2: DIFF ANALYSIS

**Step 2.1: Inventory**
- Single file changed:
  `drivers/net/wireless/mediatek/mt76/mt7915/main.c`
- +2 lines added (mutex_lock/mutex_unlock)
- Function modified: `mt7915_mac_sta_event()`
- Scope: Single-file, surgical, 2-line fix

**Step 2.2: Code Flow Change**
- Before: `mt7915_mac_twt_teardown_flow()` called in a loop without
  holding `dev->mt76.mutex`
- After: The loop is wrapped with `mutex_lock(&dev->mt76.mutex)` /
  `mutex_unlock(&dev->mt76.mutex)`
- Only the DISASSOC path is affected

**Step 2.3: Bug Mechanism**
This is a **synchronization/race condition fix**:
- `mt7915_mac_twt_teardown_flow()` modifies shared data: `flow->list`
  (via `list_del_init`), `msta->twt.flowid_mask`, `dev->twt.table_mask`,
  and `dev->twt.n_agrt`
- Without the mutex, concurrent TWT setup/teardown could corrupt linked
  lists and bitmasks
- The function itself explicitly requires the lock via
  `lockdep_assert_held()`

**Step 2.4: Fix Quality**
- Obviously correct: the function asserts the lock must be held, the fix
  provides it
- Minimal: 2 lines, wraps only the code that needs protection
- No regression risk: `mt7915_mcu_add_sta()` called after unlock doesn't
  require the lock (no lockdep_assert_held in it)
- The v2 design (adding lock in driver vs core) specifically avoids
  deadlock risk with other drivers

## PHASE 3: GIT HISTORY INVESTIGATION

**Step 3.1: Blame**
- The buggy code in `mt7915_mac_sta_event()` was introduced by commit
  `33eb14f1029085` (Felix Fietkau, 2024-08-27) "wifi: mt76: mt7915: use
  mac80211 .sta_state op"
- First appeared in v6.12-rc1; present in v6.12, v6.13, v6.14, v6.15,
  v6.16, v6.17, v6.18, v6.19, v7.0
- The `lockdep_assert_held` in `mt7915_mac_twt_teardown_flow()` has been
  there since commit `3782b69d03e714` (Lorenzo Bianconi, 2021-09-23) -
  since v5.16

**Step 3.2: Fixes Tag**
No Fixes: tag present (expected). However, the implicit fix target is
`33eb14f1029085` which exists in stable trees starting from v6.12.

**Step 3.3: File History**
Recent changes to the file are mostly unrelated. No prerequisites
needed.

**Step 3.4: Author**
Ziyi Guo is not the subsystem maintainer but the patch was accepted by
Felix Fietkau (the mt76 maintainer/author).

**Step 3.5: Dependencies**
The fix is standalone - it only adds mutex_lock/unlock calls around
existing code. No other patches needed.

## PHASE 4: MAILING LIST RESEARCH

**Step 4.1:** b4 found the patch is v2. The v1 changelog note says:
"Move the locking to MT7915 driver to avoid deadlock in other drivers."
This demonstrates review feedback was incorporated. The patch was
applied by Felix Fietkau, the mt76 maintainer.

**Step 4.2:** Applied by subsystem maintainer Felix Fietkau.

**Step 4.3-4.5:** Lore was unreachable due to bot protection. However,
the patch metadata confirms it was properly reviewed and merged.

## PHASE 5: CODE SEMANTIC ANALYSIS

**Step 5.1:** Modified function: `mt7915_mac_sta_event()`

**Step 5.2:** Called from `mt76_sta_state()` in `mac80211.c` (line
1671), which is the mac80211 `.sta_state` callback. This is called
during every station state transition (connect/disconnect).

**Step 5.3:** `mt7915_mac_twt_teardown_flow()` modifies:
- `flow->list` via `list_del_init()` - list corruption without lock
- `msta->twt.flowid_mask` - bitmask corruption
- `dev->twt.table_mask` - global device state
- `dev->twt.n_agrt` - global counter

**Step 5.4:** The path is: mac80211 sta_state callback -> mt76_sta_state
-> mt7915_mac_sta_event -> mt7915_mac_twt_teardown_flow. This is
triggered during every WiFi client disassociation on mt7915 hardware - a
common operation.

**Step 5.5:** The existing `mt7915_twt_teardown_request()` (line
1647-1658) correctly acquires the same mutex before calling the same
function, confirming the required locking pattern.

## PHASE 6: STABLE TREE ANALYSIS

**Step 6.1:** The buggy code was introduced in v6.12 (commit
33eb14f1029085). It affects stable trees v6.12.y and later.

**Step 6.2:** The patch is a trivial 2-line addition that should apply
cleanly. The code around it has not changed significantly.

**Step 6.3:** No other fix for this issue found.

## PHASE 7: SUBSYSTEM CONTEXT

**Step 7.1:** WiFi driver (drivers/net/wireless/mediatek/mt76/mt7915/).
Criticality: IMPORTANT - mt7915 is a widely used WiFi chipset (used in
many routers and access points).

**Step 7.2:** The mt76 subsystem is actively developed.

## PHASE 8: IMPACT AND RISK ASSESSMENT

**Step 8.1:** Affected users: All users of mt7915 WiFi hardware (common
in routers and access points).

**Step 8.2:** Trigger: Every WiFi client disassociation with active TWT
flows. Any WiFi 6 client using TWT can trigger this. With CONFIG_LOCKDEP
enabled, it produces a lockdep warning/splat. Without lockdep, the race
window exists and could corrupt shared data structures.

**Step 8.3:** Failure modes:
- With LOCKDEP: lockdep assertion failure/warning on every
  disassociation
- Without LOCKDEP: Race condition on `list_del_init`, bitmask/counter
  manipulation - could cause list corruption (crash), incorrect TWT
  state, or data corruption
- Severity: HIGH (list corruption can crash, lockdep warning disrupts
  debug/testing)

**Step 8.4:** Risk-benefit:
- Benefit: HIGH - prevents lockdep warnings and potential data
  corruption/crash
- Risk: VERY LOW - 2-line addition, obviously correct, wraps only the
  necessary code
- The v2 design specifically avoids deadlock in other drivers

## PHASE 9: FINAL SYNTHESIS

**Step 9.1: Evidence**
FOR backporting:
- Fixes a real locking bug: `lockdep_assert_held()` proves the lock is
  required
- Prevents list corruption and data races on shared data structures
- 2-line fix, obviously correct, minimal risk
- Applied by subsystem maintainer (Felix Fietkau)
- v2 incorporated review feedback to avoid deadlock
- Consistent with existing locking patterns in the same driver
- Affects widely-used WiFi hardware (mt7915)

AGAINST backporting:
- No explicit user report/crash report
- Bug only affects WiFi 6 TWT flow teardown (somewhat specialized path)
- Code only exists since v6.12

**Step 9.2: Stable Rules Checklist**
1. Obviously correct and tested? YES - lockdep_assert_held proves
   requirement, fix is trivial
2. Fixes a real bug? YES - missing lock on shared data structures
3. Important issue? YES - list corruption, data races, lockdep warnings
4. Small and contained? YES - 2 lines, single file
5. No new features/APIs? YES - pure bug fix
6. Can apply to stable? YES - clean apply expected for v6.12+

**Step 9.3: Exception Categories**: N/A - this is a standard bug fix.

**Step 9.4: Decision**: YES - this is a small, surgical fix for a real
locking bug that could cause data corruption or crashes on widely-used
WiFi hardware.

## Verification

- [Phase 1] Parsed tags: Signed-off-by Felix Fietkau (maintainer), Link
  to patch.msgid.link
- [Phase 2] Diff analysis: +2 lines in mt7915/main.c, adds
  mutex_lock/unlock around twt_teardown_flow loop
- [Phase 3] git blame: Buggy code introduced by 33eb14f1029085 (Felix
  Fietkau, 2024-08-27, first in v6.12-rc1)
- [Phase 3] git show 33eb14f1029085: Confirmed refactoring that
  introduced the missing lock path
- [Phase 3] lockdep_assert_held confirmed at mac.c:2385 in
  mt7915_mac_twt_teardown_flow()
- [Phase 3] Existing correct locking pattern verified at
  main.c:1655-1657 (mt7915_twt_teardown_request)
- [Phase 4] b4 am: Retrieved v2 patch, confirmed changelog "Move the
  locking to MT7915 driver to avoid deadlock in other drivers"
- [Phase 4] lore.kernel.org was unreachable (bot protection) - could not
  read full thread discussion
- [Phase 5] mt76_sta_state at mac80211.c:1671 calls sta_event without
  lock (verified by reading code)
- [Phase 5] mt76_sta_add at mac80211.c:1599 and mt76_sta_remove at
  mac80211.c:1630-1632 DO hold the lock (verified)
- [Phase 5] mt7915_mac_twt_teardown_flow modifies shared: list_del_init,
  flowid_mask, table_mask, n_agrt (verified at mac.c:2398-2401)
- [Phase 6] Bug exists in v6.12 through v7.0 (verified via git tag
  --contains)
- [Phase 6] Code does NOT exist before v6.12 (verified: no
  sta_state/sta_event changes in v6.6..v6.11)
- [Phase 8] Trigger: WiFi client disassociation with TWT flows on mt7915
  hardware
- UNVERIFIED: Could not access lore.kernel.org for full review
  discussion due to bot protection

**YES**

 drivers/net/wireless/mediatek/mt76/mt7915/main.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/wireless/mediatek/mt76/mt7915/main.c b/drivers/net/wireless/mediatek/mt76/mt7915/main.c
index 90d5e79fbf74d..e212e964fda03 100644
--- a/drivers/net/wireless/mediatek/mt76/mt7915/main.c
+++ b/drivers/net/wireless/mediatek/mt76/mt7915/main.c
@@ -851,8 +851,10 @@ int mt7915_mac_sta_event(struct mt76_dev *mdev, struct ieee80211_vif *vif,
 		return mt7915_mcu_add_sta(dev, vif, sta, CONN_STATE_PORT_SECURE, false);
 
 	case MT76_STA_EVENT_DISASSOC:
+		mutex_lock(&dev->mt76.mutex);
 		for (i = 0; i < ARRAY_SIZE(msta->twt.flow); i++)
 			mt7915_mac_twt_teardown_flow(dev, msta, i);
+		mutex_unlock(&dev->mt76.mutex);
 
 		mt7915_mcu_add_sta(dev, vif, sta, CONN_STATE_DISCONNECT, false);
 		msta->wcid.sta_disabled = 1;
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH AUTOSEL 7.0-6.1] Bluetooth: btmtk: improve mt79xx firmware setup retry flow
       [not found] <20260420132314.1023554-1-sashal@kernel.org>
                   ` (5 preceding siblings ...)
  2026-04-20 13:18 ` [PATCH AUTOSEL 7.0-6.12] wifi: mt76: add missing lock protection in mt76_sta_state for sta_event callback Sasha Levin
@ 2026-04-20 13:18 ` Sasha Levin
  2026-04-20 13:19 ` [PATCH AUTOSEL 7.0-6.18] wifi: mt76: mt7996: Disable Rx hdr_trans in monitor mode Sasha Levin
                   ` (9 subsequent siblings)
  16 siblings, 0 replies; 17+ messages in thread
From: Sasha Levin @ 2026-04-20 13:18 UTC (permalink / raw)
  To: patches, stable
  Cc: Chris Lu, Luiz Augusto von Dentz, Sasha Levin, marcel, luiz.dentz,
	matthias.bgg, angelogioacchino.delregno, linux-bluetooth,
	linux-kernel, linux-arm-kernel, linux-mediatek

From: Chris Lu <chris.lu@mediatek.com>

[ Upstream commit 54f1f020e9f4a087779cc4d96a7c86f47d0c6797 ]

If retries are exhausted, driver should not do futher operation.
During mt79xx firmware download process, if the retry count reaches0,
driver will return an -EIO error and release the firmware resources.

Signed-off-by: Chris Lu <chris.lu@mediatek.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

LLM Generated explanations, may be completely bogus:

Now I have all the information needed for a thorough analysis. Let me
compile the full assessment.

---

## PHASE 1: COMMIT MESSAGE FORENSICS

**Step 1.1: Subject Line**
Record: [bluetooth/btmtk] [improve] Fix retry exhaustion handling in
mt79xx firmware setup - when retry count reaches 0, driver should return
error instead of continuing.

**Step 1.2: Tags**
- Signed-off-by: Chris Lu <chris.lu@mediatek.com> (author, regular
  MediaTek Bluetooth contributor)
- Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
  (Bluetooth maintainer, committed the patch)
- No Fixes: tag, no Reported-by, no Cc: stable - expected for manual
  review candidates.

**Step 1.3: Commit Body**
The body describes: "If retries are exhausted, driver should not do
further operation. During mt79xx firmware download process, if the retry
count reaches 0, driver will return an -EIO error and release the
firmware resources." Clear description of a missing error check.

**Step 1.4: Hidden Bug Fix Detection**
Despite the word "improve" in the subject, this IS a bug fix. The word
"improve" masks a clear logic error: the retry loop can exhaust without
any error return, causing the driver to silently proceed with firmware
download on a device in an abnormal state.

## PHASE 2: DIFF ANALYSIS

**Step 2.1: Inventory**
- 1 file changed: `drivers/bluetooth/btmtk.c`
- +6 lines added (1 comment, 5 code lines)
- Function modified: `btmtk_setup_firmware_79xx()`
- Scope: single-file surgical fix

**Step 2.2: Code Flow Change**
The retry loop at line 171 (`while (retry > 0)`) handles
`BTMTK_WMT_PATCH_PROGRESS` by decrementing `retry`. BEFORE: if retry
hits 0, the loop exits normally and code falls through to `fw_ptr +=
section_offset`, proceeding with firmware download. AFTER: a check for
`retry == 0` returns `-EIO` and jumps to `err_release_fw`.

**Step 2.3: Bug Mechanism**
This is a **logic/correctness fix** - missing error check after retry
exhaustion. The `while (retry > 0)` loop can exit via:
1. `break` when status == `BTMTK_WMT_PATCH_UNDONE` (normal path -
   proceed to download)
2. `goto next_section` when status == `BTMTK_WMT_PATCH_DONE` (skip
   section)
3. `goto err_release_fw` on command error or unexpected status
4. Loop exhaustion when retry reaches 0 (BUG: falls through to download
   path)

Case 4 is the bug - the code proceeds as if the device is ready when
it's not.

**Step 2.4: Fix Quality**
Obviously correct. The check `if (retry == 0)` can only be true if the
loop exhausted, meaning the device never left `PATCH_PROGRESS` state.
Returning `-EIO` and cleaning up is the correct behavior. No regression
risk.

## PHASE 3: GIT HISTORY INVESTIGATION

**Step 3.1: Blame**
The buggy code was introduced in commit `8c0d17b6b06c5b` "Bluetooth:
mediatek: add BT_MTK module" by Sean Wang on 2021-10-19. This was the
initial creation of the BT_MTK module.

**Step 3.2: No Fixes: tag** - expected.

**Step 3.3: File History**
The function `btmtk_setup_firmware_79xx` has been stable since 2021 in
the retry loop area. The surrounding code has only had one minor change
(commit `995d948cf2e458` adding `err = -EIO` in the else branch).

**Step 3.4: Author**
Chris Lu is a regular MediaTek Bluetooth contributor with 28+ commits
touching `drivers/bluetooth/`, including many device ID additions and
critical fixes.

**Step 3.5: Dependencies**
This commit is patch 1/3 of a series, but it is **standalone**. Patches
2/3 and 3/3 add additional improvements (status checking and reset
mechanism) that build on this but are not required. The fix applies
cleanly without dependencies.

## PHASE 4: MAILING LIST RESEARCH

**Step 4.1: Original Discussion**
Found via `b4 dig -c 54f1f020e9f4`: Submitted as `[PATCH v1 1/3]` on
2026-02-03. The cover letter explains: "When the device unexpectedly
restarts during previous firmware download process, it can cause mt79xx
firmware status to be abnormal in the next attempt." Series applied to
bluetooth-next by Luiz Augusto von Dentz on 2026-04-10.

**Step 4.2: Review**
Only v1 was submitted (no revisions needed). The Bluetooth maintainer
(Luiz Augusto von Dentz) applied the series directly, indicating
confidence in the fix quality.

**Step 4.3: Bug Report**
No specific bug report link. The cover letter describes a real-world
scenario where the device unexpectedly restarts during firmware
download.

**Step 4.4: Series Context**
Part of 3-patch series, but this patch is standalone. Patches 2 and 3
are independent improvements that enhance the error recovery further.

**Step 4.5: Stable Discussion**
No existing stable nomination or discussion found.

## PHASE 5: CODE SEMANTIC ANALYSIS

**Step 5.1: Key Function**
`btmtk_setup_firmware_79xx()` - firmware setup for MT79xx series.

**Step 5.2: Callers**
- `btmtk_usb_setup()` in `btmtk.c` line 1332 - USB path for MT7922,
  MT7925, MT7961
- `mt79xx_setup()` in `btmtksdio.c` line 873 - SDIO path

Both are called during device initialization/setup.

**Step 5.3-5.4: Reachability**
Called during HCI device setup, triggered when a MT79xx Bluetooth device
is initialized. This is a common code path for all MT792x Bluetooth
device users.

## PHASE 6: STABLE TREE ANALYSIS

**Step 6.1: Code exists in stable**
The buggy code was introduced in October 2021 (commit `8c0d17b6b06c5b`).
Tags show it's in p-6.1, p-6.6, and all newer stable trees. The bug
affects ALL active stable trees.

**Step 6.2: Backport Complexity**
The patch should apply cleanly - the retry loop code hasn't changed
since the original 2021 commit.

## PHASE 7: SUBSYSTEM CONTEXT

**Step 7.1: Subsystem**
Bluetooth driver (drivers/bluetooth/) - IMPORTANT criticality.
MT7921/MT7922/MT7925 are extremely popular WiFi/BT combo chips found in
many laptops (Lenovo, ASUS, Dell, etc.).

## PHASE 8: IMPACT AND RISK ASSESSMENT

**Step 8.1: Affected Users**
Users of MediaTek MT7921, MT7922, MT7925 Bluetooth devices (very large
population).

**Step 8.2: Trigger Conditions**
Triggered when the device reports `BTMTK_WMT_PATCH_PROGRESS`
continuously for 2+ seconds during firmware download. The cover letter
describes this happening after an unexpected device restart during a
previous firmware download attempt.

**Step 8.3: Failure Mode**
Without fix: firmware download proceeds on a device in an abnormal
state, potentially leading to device malfunction, failed bluetooth
initialization, or undefined behavior. Severity: MEDIUM-HIGH.

**Step 8.4: Risk-Benefit Ratio**
- BENEFIT: Prevents firmware download to a device in an abnormal state
  for widely-used hardware
- RISK: Extremely low - 5 lines, obviously correct, only affects an
  already-failed case
- Ratio: Strongly favorable

## PHASE 9: FINAL SYNTHESIS

**Evidence FOR backporting:**
- Fixes a real logic error (missing error check after retry exhaustion)
- Very small and surgical (5 lines of code)
- Obviously correct
- Affects widely-used hardware (MT792x family)
- Standalone fix with no dependencies
- Applied by subsystem maintainer without revisions needed
- Buggy code exists in all active stable trees since 2021
- Clean apply expected

**Evidence AGAINST backporting:**
- Not a crash/panic fix - the immediate impact is proceeding with
  firmware download in an abnormal state
- No specific user-reported bug linked
- Commit message uses "improve" rather than "fix"

**Stable Rules Checklist:**
1. Obviously correct and tested? YES - trivially verifiable, applied to
   bluetooth-next
2. Fixes a real bug? YES - missing error handling after retry exhaustion
3. Important issue? MEDIUM-HIGH - prevents undefined device behavior
   during firmware setup for popular hardware
4. Small and contained? YES - 5 lines, one file, one function
5. No new features? CORRECT - pure error handling fix
6. Can apply to stable? YES - clean apply expected

## Verification

- [Phase 1] Parsed tags: Signed-off-by from Chris Lu (author) and Luiz
  Augusto von Dentz (maintainer/committer)
- [Phase 2] Diff analysis: 5 lines added after retry loop in
  `btmtk_setup_firmware_79xx()`, adds `retry == 0` check returning -EIO
- [Phase 3] git blame: buggy code introduced in commit 8c0d17b6b06c5b
  (2021-10-19, Sean Wang), present in all stable trees
- [Phase 3] git tag --contains: confirmed present in p-6.1, p-6.6, and
  all newer stable tags
- [Phase 3] git log --author="Chris Lu": confirmed Chris Lu is a regular
  MediaTek BT contributor with 28+ commits
- [Phase 4] b4 dig -c 54f1f020e9f4: found original submission at
  https://patch.msgid.link/20260203062510.848761-2-chris.lu@mediatek.com
- [Phase 4] b4 dig -a: only v1, no revisions needed, applied directly by
  maintainer
- [Phase 4] Thread mbox: series was applied to bluetooth-next,
  patchwork-bot confirmed
- [Phase 4] Cover letter: describes real-world scenario (device restart
  during firmware download)
- [Phase 5] Grep callers: btmtk_usb_setup() (USB path) and
  mt79xx_setup() (SDIO path) both call this function
- [Phase 6] Code unchanged in retry loop since 2021 - clean apply
  expected
- [Phase 8] Failure mode: proceeds with firmware download on abnormal
  device state, severity MEDIUM-HIGH

**YES**

 drivers/bluetooth/btmtk.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/bluetooth/btmtk.c b/drivers/bluetooth/btmtk.c
index fa7533578f85c..0ada5a12130dc 100644
--- a/drivers/bluetooth/btmtk.c
+++ b/drivers/bluetooth/btmtk.c
@@ -205,6 +205,12 @@ int btmtk_setup_firmware_79xx(struct hci_dev *hdev, const char *fwname,
 				}
 			}
 
+			/* If retry exhausted goto err_release_fw */
+			if (retry == 0) {
+				err = -EIO;
+				goto err_release_fw;
+			}
+
 			fw_ptr += section_offset;
 			wmt_params.op = BTMTK_WMT_PATCH_DWNLD;
 			wmt_params.status = NULL;
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH AUTOSEL 7.0-6.18] wifi: mt76: mt7996: Disable Rx hdr_trans in monitor mode
       [not found] <20260420132314.1023554-1-sashal@kernel.org>
                   ` (6 preceding siblings ...)
  2026-04-20 13:18 ` [PATCH AUTOSEL 7.0-6.1] Bluetooth: btmtk: improve mt79xx firmware setup retry flow Sasha Levin
@ 2026-04-20 13:19 ` Sasha Levin
  2026-04-20 13:19 ` [PATCH AUTOSEL 7.0-6.12] wifi: mt76: mt7925: Skip scan process during suspend Sasha Levin
                   ` (8 subsequent siblings)
  16 siblings, 0 replies; 17+ messages in thread
From: Sasha Levin @ 2026-04-20 13:19 UTC (permalink / raw)
  To: patches, stable
  Cc: Ryder Lee, Felix Fietkau, Sasha Levin, lorenzo, matthias.bgg,
	angelogioacchino.delregno, linux-wireless, linux-kernel,
	linux-arm-kernel, linux-mediatek

From: Ryder Lee <ryder.lee@mediatek.com>

[ Upstream commit 947d63d8cd3b03c7be16875ca90273edbdbe7ce5 ]

Ensure raw frames are captured without header modification.

Signed-off-by: Ryder Lee <ryder.lee@mediatek.com>
Link: https://patch.msgid.link/04008426d6cd5de3995beefb98f9d13f35526c25.1770969275.git.ryder.lee@mediatek.com
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

LLM Generated explanations, may be completely bogus:

Good. Now I have all the information needed. Let me compile my analysis.

---

## PHASE 1: COMMIT MESSAGE FORENSICS

**Step 1.1: Subject Line**
- Subsystem: `wifi: mt76: mt7996:` (MediaTek MT7996 WiFi driver)
- Action verb: "Disable" — implies correcting a misconfiguration
- Summary: Disable Rx header translation when monitor mode is active

Record: [wifi: mt76: mt7996] [Disable] [Disable Rx header translation in
monitor mode to ensure raw frames]

**Step 1.2: Tags**
- `Signed-off-by: Ryder Lee <ryder.lee@mediatek.com>` — MediaTek
  employee, regular mt76 contributor
- `Link:` to patch.msgid.link (submission URL)
- `Signed-off-by: Felix Fietkau <nbd@nbd.name>` — mt76 subsystem
  maintainer who merged it
- No Fixes: tag, no Reported-by:, no Cc: stable

Record: Author is a MediaTek contributor. Maintainer Felix Fietkau
merged it.

**Step 1.3: Commit Body**
"Ensure raw frames are captured without header modification." This is
terse but clearly states: without this fix, monitor mode frames are
modified (translated from 802.11 to Ethernet format), which makes
captured frames incorrect/useless.

Record: [Bug: monitor mode captures frames with modified (translated)
headers instead of raw 802.11 frames] [Symptom: packet capture tools see
Ethernet headers instead of 802.11 headers] [Root cause: RX header
translation not disabled when entering monitor mode]

**Step 1.4: Hidden Bug Fix Detection**
This IS a bug fix. "Ensure raw frames are captured" means they currently
are NOT captured correctly. Monitor mode is broken without this fix — it
produces unusable output.

Record: [Yes, this is a clear bug fix. Monitor mode produces incorrectly
formatted frames.]

## PHASE 2: DIFF ANALYSIS

**Step 2.1: Inventory**
- `mt7996/regs.h`: +3 lines (register and bit definitions)
- `mt7996/main.c`: +2 lines (register write to disable/enable hdr_trans)
- Total: +5 lines, 0 removed
- Functions modified: `mt7996_set_monitor()` only
- Scope: Single-file surgical fix (+ supporting register defines)

Record: [2 files, +5 lines, 0 removed] [mt7996_set_monitor()] [Single-
function surgical fix]

**Step 2.2: Code Flow**
Before: `mt7996_set_monitor()` sets `MT_DMA_DCR0_RXD_G5_EN`, updates rx
filter, and sets sniffer mode — but does NOT disable hardware header
translation.
After: Additionally toggles `MT_MDP_DCR0_RX_HDR_TRANS_EN` — disabling it
when monitor=enabled, enabling it when monitor=disabled.

Record: [Before: hdr_trans stays enabled in monitor mode → corrupted
captures. After: hdr_trans properly toggled with monitor mode]

**Step 2.3: Bug Mechanism**
Category: (g) Logic/correctness fix — missing hardware configuration
step.
The hardware's RX header translation converts 802.11 frame headers to
Ethernet headers. In monitor mode, raw 802.11 frames must be captured
unmodified. Not disabling this translation makes monitor mode output
incorrect.

Record: [Missing hardware configuration] [hdr_trans not toggled →
monitor mode frames have wrong headers]

**Step 2.4: Fix Quality**
- Obviously correct: The mt7915 sibling driver does the exact same thing
  (verified at `mt7915/main.c:496`)
- Minimal/surgical: 2 lines of functional code + 3 register defs
- Regression risk: Very low — only affects monitor mode path, standard
  register toggle
- No red flags

Record: [Obviously correct, mirrors mt7915. Minimal. Very low regression
risk.]

## PHASE 3: GIT HISTORY

**Step 3.1: Blame**
The `mt7996_set_monitor()` function was introduced by commit
`69d54ce7491d04` ("wifi: mt76: mt7996: switch to single multi-radio
wiphy") by Felix Fietkau, first appearing in v6.14-rc1. Before v6.14,
monitor mode was handled inline in `mt7996_config()` — also missing
hdr_trans disable.

Record: [Buggy code introduced in 69d54ce7491d04, v6.14. Older code
(v6.12 and before) also lacked this but had different code structure.]

**Step 3.2: Fixes tag**
No Fixes: tag present (expected).

**Step 3.3: File History**
Recent changes to main.c show numerous MLO/MLD fixes. The
`cb423ddad0f6e` commit fixed a NULL deref in the same
`mt7996_set_monitor()` function (moved `dev = phy->dev` after the NULL
check). This prerequisite is already in the current tree.

Record: [cb423ddad0f6e is a prerequisite that's already applied. No
other dependencies found.]

**Step 3.4: Author**
Ryder Lee is a regular MediaTek contributor to mt76 with multiple
accepted patches.

Record: [Regular MediaTek contributor to the subsystem]

**Step 3.5: Dependencies**
The patch adds `MT_MDP_DCR0` and `MT_MDP_DCR0_RX_HDR_TRANS_EN` register
definitions and uses them. Self-contained — no external dependencies
beyond the function already existing.

The function `mt7996_set_monitor()` only exists from v6.14+. For v6.14.y
backport, the NULL deref fix `cb423ddad0f6e` would need to be present
first (or the patch adapted to the pre-fix code).

Record: [Self-contained. Applies to v6.14+ where mt7996_set_monitor()
exists.]

## PHASE 4: MAILING LIST RESEARCH

**Step 4.1-4.5:**
Lore was not accessible due to anti-bot protection. The Link: in the
commit points to `patch.msgid.link/04008426d6cd5de3995beefb98f9d13f35526
c25.1770969275.git.ryder.lee@mediatek.com`. B4 dig did not find the
commit (likely not in the local repo under that hash).

Record: [UNVERIFIED: Could not access lore or b4 dig results. However,
Felix Fietkau (mt76 maintainer) signed off on the merge, confirming
maintainer review.]

## PHASE 5: CODE SEMANTIC ANALYSIS

**Step 5.1: Functions modified**
`mt7996_set_monitor()` — a static function in main.c.

**Step 5.2: Callers**
`mt7996_set_monitor()` is called from:
- `mt7996_add_interface()` when `vif->type == NL80211_IFTYPE_MONITOR`
  (line 501)
- `mt7996_remove_interface()` when monitor mask changes (line 547)

These are standard mac80211 callbacks triggered when a user adds/removes
a monitor interface (e.g., `iw dev wlan0 set type monitor`).

Record: [Called from mac80211 interface add/remove — standard user-
triggered path]

**Step 5.3: What it calls**
`mt76_rmw_field()` — standard register read-modify-write. This is a
well-tested primitive.

**Step 5.4: Reachability**
User creates a monitor interface → mac80211 → `mt7996_add_interface()` →
`mt7996_set_monitor()`. Fully reachable from userspace.

Record: [Reachable via standard WiFi monitor mode interface creation]

**Step 5.5: Similar patterns**
The mt7915 driver has the exact same pattern at `mt7915/main.c:496`:

```494:495:drivers/net/wireless/mediatek/mt76/mt7915/main.c
                mt76_rmw_field(dev, MT_DMA_DCR0(band),
MT_MDP_DCR0_RX_HDR_TRANS_EN,
                               !dev->monitor_mask);
```

This confirms the fix is correct and needed — the mt7996 was simply
missing this step.

Record: [mt7915 already has this exact pattern. mt7996 was missing it.]

## PHASE 6: STABLE TREE ANALYSIS

**Step 6.1: Does buggy code exist in stable trees?**
- `mt7996_set_monitor()` was introduced in v6.14 (commit
  `69d54ce7491d04`)
- Does NOT exist in v6.12 or v6.13 (confirmed via `git show v6.12:...`
  and `git show v6.13:...`)
- The older monitor code path (in `mt7996_config()`) also lacked
  hdr_trans disable, but has different structure
- Applicable stable trees: v6.14.y and later (v6.14 has active stable
  releases through v6.14.11)

Record: [Buggy code exists in 6.14.y. Older trees have different code
structure with same bug.]

**Step 6.2: Backport complications**
- For 6.14.y: The `dev` initialization is before the NULL check
  (pre-`cb423ddad0f6e`), but the patch insertion point is identical.
  Minor context difference but patch should apply or need trivial
  adjustment.
- `MT_MDP_DCR0` register definitions don't exist in 6.14.y's regs.h
  (confirmed), so the register defs must come with the patch (they do).

Record: [6.14.y: Near-clean apply, minor context difference from NULL
deref fix]

**Step 6.3: Related fixes already in stable**
No evidence of a different fix for this same issue in any stable tree.

Record: [No prior fix found]

## PHASE 7: SUBSYSTEM AND MAINTAINER CONTEXT

**Step 7.1:** WiFi driver (mt76/mt7996) — IMPORTANT subsystem. MT7996 is
MediaTek's WiFi 7 chipset used in access points and routers.

**Step 7.2:** Very active subsystem — 73 changes between v6.14 and v7.0
in this single file.

Record: [IMPORTANT subsystem, very active development]

## PHASE 8: IMPACT AND RISK ASSESSMENT

**Step 8.1: Affected users**
All users of MT7996/MT7992 WiFi hardware who use monitor mode for packet
capture or WiFi analysis.

**Step 8.2: Trigger conditions**
Trigger: Enable monitor mode on an mt7996 device. Every user of monitor
mode is affected. Common operation for network administrators, WiFi
developers, and security researchers.

**Step 8.3: Failure mode severity**
Not a crash — but monitor mode produces incorrect/corrupted frame
captures. The feature is essentially non-functional. Severity: MEDIUM-
HIGH (complete functional failure of a core WiFi feature).

**Step 8.4: Risk-benefit**
- BENEFIT: HIGH — fixes completely broken monitor mode on mt7996
  hardware
- RISK: VERY LOW — 2 lines toggling a hardware register bit, same
  pattern as mt7915, only affects monitor mode code path
- Ratio: Very favorable

Record: [Benefit HIGH, Risk VERY LOW. 5-line patch fixing broken monitor
mode.]

## PHASE 9: FINAL SYNTHESIS

**Step 9.1: Evidence compilation**

FOR backporting:
- Fixes a real functional bug: monitor mode captures are corrupted
- Small, surgical fix: +5 lines total (2 functional + 3 register defs)
- Obviously correct: mirrors mt7915 sibling driver behavior
- Maintainer-reviewed: Felix Fietkau signed off
- Author is a MediaTek contributor who knows the hardware
- Low regression risk: only affects monitor mode path
- Reachable from userspace via standard WiFi operations

AGAINST backporting:
- No crash/security/data corruption — "just" broken monitor mode output
- No Fixes: tag, no Reported-by: (no evidence of user complaints)
- Only applies to v6.14+ (limited stable tree scope)
- Terse commit message doesn't call this a "fix"

**Step 9.2: Stable rules checklist**
1. Obviously correct and tested? YES — mirrors mt7915, accepted by
   maintainer
2. Fixes a real bug? YES — monitor mode doesn't work correctly
3. Important issue? MEDIUM-HIGH — complete functional failure of monitor
   mode
4. Small and contained? YES — 5 lines, 1 function
5. No new features/APIs? CORRECT — no new features
6. Can apply to stable? YES for v6.14.y (with possible minor context
   adaptation)

**Step 9.3: Exception categories**
This is a hardware configuration bug fix, not an exception category.

**Step 9.4: Decision**
This is a small, obviously correct bug fix that restores monitor mode
functionality on MT7996 hardware. The fix mirrors what the mt7915 driver
already does, is reviewed by the subsystem maintainer, and has very low
regression risk. While it doesn't fix a crash, it fixes complete
functional breakage of a core WiFi feature (monitor mode).

**Verification:**
- [Phase 1] Parsed tags: Signed-off-by from MediaTek employee and mt76
  maintainer. No Fixes/Reported-by.
- [Phase 2] Diff analysis: +2 lines in mt7996_set_monitor() toggling
  MT_MDP_DCR0_RX_HDR_TRANS_EN, +3 lines register definitions in regs.h
- [Phase 3] git blame: mt7996_set_monitor() introduced in 69d54ce7491d04
  (v6.14-rc1)
- [Phase 3] Verified mt7996_set_monitor() does NOT exist in v6.12 or
  v6.13 (git show confirmed)
- [Phase 3] cb423ddad0f6e is a prerequisite NULL deref fix already in
  tree
- [Phase 4] UNVERIFIED: Lore/b4 not accessible due to anti-bot
  protection
- [Phase 5] grep confirmed mt7915/main.c:496 has identical hdr_trans
  disable pattern
- [Phase 5] Callers: mt7996_add_interface() and
  mt7996_remove_interface() — standard mac80211 callbacks
- [Phase 6] v6.14 regs.h confirmed: MT_MDP_DCR0 not defined, MT_MDP_DCR2
  exists at line 136
- [Phase 6] v6.14.y is the earliest applicable stable tree
- [Phase 7] Active subsystem: 73 commits between v6.14 and v7.0 in
  main.c
- [Phase 8] Failure mode: monitor mode captures have translated (wrong)
  headers — feature non-functional

**YES**

 drivers/net/wireless/mediatek/mt76/mt7996/main.c | 2 ++
 drivers/net/wireless/mediatek/mt76/mt7996/regs.h | 3 +++
 2 files changed, 5 insertions(+)

diff --git a/drivers/net/wireless/mediatek/mt76/mt7996/main.c b/drivers/net/wireless/mediatek/mt76/mt7996/main.c
index f16135f0b7f94..110b9a32e5ec8 100644
--- a/drivers/net/wireless/mediatek/mt76/mt7996/main.c
+++ b/drivers/net/wireless/mediatek/mt76/mt7996/main.c
@@ -472,6 +472,8 @@ static void mt7996_set_monitor(struct mt7996_phy *phy, bool enabled)
 
 	mt76_rmw_field(dev, MT_DMA_DCR0(phy->mt76->band_idx),
 		       MT_DMA_DCR0_RXD_G5_EN, enabled);
+	mt76_rmw_field(dev, MT_MDP_DCR0,
+		       MT_MDP_DCR0_RX_HDR_TRANS_EN, !enabled);
 	mt7996_phy_set_rxfilter(phy);
 	mt7996_mcu_set_sniffer_mode(phy, enabled);
 }
diff --git a/drivers/net/wireless/mediatek/mt76/mt7996/regs.h b/drivers/net/wireless/mediatek/mt76/mt7996/regs.h
index e48e0e575b646..393faae2d52b6 100644
--- a/drivers/net/wireless/mediatek/mt76/mt7996/regs.h
+++ b/drivers/net/wireless/mediatek/mt76/mt7996/regs.h
@@ -159,6 +159,9 @@ enum offs_rev {
 #define MT_MDP_BASE				0x820cc000
 #define MT_MDP(ofs)				(MT_MDP_BASE + (ofs))
 
+#define MT_MDP_DCR0				MT_MDP(0x800)
+#define MT_MDP_DCR0_RX_HDR_TRANS_EN		BIT(19)
+
 #define MT_MDP_DCR2				MT_MDP(0x8e8)
 #define MT_MDP_DCR2_RX_TRANS_SHORT		BIT(2)
 
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH AUTOSEL 7.0-6.12] wifi: mt76: mt7925: Skip scan process during suspend.
       [not found] <20260420132314.1023554-1-sashal@kernel.org>
                   ` (7 preceding siblings ...)
  2026-04-20 13:19 ` [PATCH AUTOSEL 7.0-6.18] wifi: mt76: mt7996: Disable Rx hdr_trans in monitor mode Sasha Levin
@ 2026-04-20 13:19 ` Sasha Levin
  2026-04-20 13:19 ` [PATCH AUTOSEL 7.0-5.10] wifi: mt76: mt76x02: wake queues after reconfig Sasha Levin
                   ` (7 subsequent siblings)
  16 siblings, 0 replies; 17+ messages in thread
From: Sasha Levin @ 2026-04-20 13:19 UTC (permalink / raw)
  To: patches, stable
  Cc: Michael Lo, Felix Fietkau, Sasha Levin, lorenzo, ryder.lee,
	matthias.bgg, angelogioacchino.delregno, linux-wireless,
	linux-kernel, linux-arm-kernel, linux-mediatek

From: Michael Lo <michael.lo@mediatek.com>

[ Upstream commit 8c7e19612b01567f641d3ffe21e47fa21c331171 ]

We are experiencing command timeouts because an upper layer triggers
an unexpected scan while the system/device is in suspend.
The upper layer should not initiate scans until the NIC has fully resumed.
We want to prevent scans during suspend and avoid timeouts without harming
power management or user experience.

Signed-off-by: Michael Lo <michael.lo@mediatek.com>
Link: https://patch.msgid.link/20260112114007.2115873-1-leon.yen@mediatek.com
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

LLM Generated explanations, may be completely bogus:

Now I have a comprehensive understanding. Let me compile my analysis.

---

## PHASE 1: COMMIT MESSAGE FORENSICS

**Step 1.1: Subject Line**
- Subsystem: `wifi: mt76: mt7925:`
- Action: "Skip" — implies avoiding an undesired operation during a
  specific state
- Summary: Skip scan processing during suspend to prevent command
  timeouts
- Record: [wifi/mt76/mt7925] [skip/prevent] [avoid scan work execution
  during suspend to prevent command timeouts]

**Step 1.2: Tags**
- `Signed-off-by: Michael Lo <michael.lo@mediatek.com>` — MediaTek
  employee, driver vendor
- `Link: https://patch.msgid.link/20260112114007.2115873-1-
  leon.yen@mediatek.com` — original submission by Leon Yen
- `Signed-off-by: Felix Fietkau <nbd@nbd.name>` — mt76 subsystem
  maintainer, applied the patch
- No Fixes: tag (expected for autosel candidates)
- No Reported-by: tag
- Record: Author from MediaTek (driver vendor), applied by subsystem
  maintainer Felix Fietkau.

**Step 1.3: Commit Body**
- Bug: Command timeouts during suspend because upper layer triggers a
  scan while device is suspended
- Symptom: Command timeouts
- Root cause: Scan work runs when device is in suspended state and can't
  respond to firmware commands
- Record: [Bug: command timeout during suspend from unexpected scan]
  [Symptom: timeout errors] [Root cause: scan work executing while
  device is powered down]

**Step 1.4: Hidden Bug Fix Detection**
- Despite using "Skip" rather than "fix," this directly addresses a
  command timeout — a real functional bug. Users would experience
  suspend failures or WiFi errors after resume.
- Record: Yes, this is a bug fix. The "skip" phrasing masks a fix for
  command timeouts during suspend.

## PHASE 2: DIFF ANALYSIS

**Step 2.1: Inventory**
- 1 file modified: `drivers/net/wireless/mediatek/mt76/mt7925/main.c`
- +8 lines added (3 variable declarations, 5 lines for the pm->suspended
  check)
- Function modified: `mt7925_scan_work()`
- Scope: Single-file, single-function surgical fix
- Record: [1 file, +8 lines, mt7925_scan_work(), surgical fix]

**Step 2.2: Code Flow Change**
- BEFORE: `mt7925_scan_work()` always processes queued scan events from
  `scan_event_list`, regardless of device power state
- AFTER: `mt7925_scan_work()` first checks `pm->suspended`; if true,
  returns immediately without processing events
- This affects the scan event processing path during suspend
- Record: [Before: always processes scan events; After: skips processing
  if device is suspended]

**Step 2.3: Bug Mechanism**
- Category: Logic/correctness fix + timing issue
- The race: `mt7925_suspend()` cancels scan_work at line 1476, but
  `mt7925_mcu_scan_event()` can re-queue scan_work after cancellation.
  Between `cancel_delayed_work_sync()` and full device suspension, the
  MCU can still generate scan events, re-queuing scan_work. When
  scan_work runs against the suspended device, firmware commands time
  out.
- The `pm->suspended` flag is set in PCI/USB suspend
  (`mt7925_pci_suspend()` line 452) and cleared in resume
  (`_mt7925_pci_resume()` line 590)
- Record: [Logic/timing fix: scan work can run against suspended device
  causing command timeouts]

**Step 2.4: Fix Quality**
- Obviously correct: checks a well-established flag (`pm->suspended`)
  that is used consistently throughout the mt76 driver family
- Minimal and surgical: adds only an early return
- Minor concern: skbs in `scan_event_list` are not freed on early
  return, but they would be processed on resume or cleaned up on device
  removal
- Pattern is consistent with other uses of `pm->suspended` in the driver
  (e.g., `mt792x_mac.c:278`, `mt76_connac_mac.c:47,73`,
  `mt7925/regd.c:196`)
- Record: [Fix is obviously correct, minimal, follows established driver
  patterns. Minor skb leak concern is acceptable.]

## PHASE 3: GIT HISTORY INVESTIGATION

**Step 3.1: Blame**
- The `mt7925_scan_work()` function was introduced in commit
  `c948b5da6bbec` by Deren Wu on 2023-09-18, the initial commit adding
  the mt7925 driver
- This is the initial code — the bug has existed since the driver was
  created
- Record: [Buggy code from c948b5da6bbec (initial mt7925 driver, v6.7)]

**Step 3.2: Fixes Tag**
- No Fixes: tag present (expected for autosel candidate)
- Record: N/A

**Step 3.3: File History**
- Multiple related suspend/resume fixes for mt7925 exist:
  `bf39813599b03` (simplify HIF suspend), `2d5630b0c9466` (fix low power
  mode entry), `1b97fc8443aea` (fix regd_notifier before suspend)
- Related scan fix: `122f270aca2c8` (prevent multiple scan commands)
- No prerequisites identified for this specific fix
- Record: [Multiple suspend-related fixes indicate ongoing suspend
  reliability improvements. Fix is standalone.]

**Step 3.4: Author**
- Michael Lo / Leon Yen are MediaTek employees and regular mt76
  contributors
- Multiple suspend/resume and scan-related fixes from the same team
- Record: [Authors are driver vendor engineers with deep knowledge of
  the hardware]

**Step 3.5: Dependencies**
- The `pm->suspended` flag and `struct mt76_connac_pm` are well-
  established infrastructure present since the mt7921 driver
- No new functions or structures needed
- Record: [No dependencies. Fix uses existing infrastructure available
  in all versions with mt7925.]

## PHASE 4: MAILING LIST RESEARCH

**Step 4.1-4.5**: Lore is blocked by anti-scraping protection. b4 dig
could not find the commit. However, the patch was applied by Felix
Fietkau (mt76 maintainer), indicating it passed review.
- Record: [Could not access lore discussion. Patch was accepted by
  subsystem maintainer.]

## PHASE 5: CODE SEMANTIC ANALYSIS

**Step 5.1: Functions Modified**
- `mt7925_scan_work()` — delayed work handler for processing scan events

**Step 5.2: Callers**
- Registered as delayed work via `INIT_DELAYED_WORK(&dev->phy.scan_work,
  mt7925_scan_work)` in init.c:214
- Queued by `mt7925_mcu_scan_event()` in mcu.c:415 via
  `ieee80211_queue_delayed_work()`
- Cancelled in `mt7925_suspend()` at main.c:1476

**Step 5.3-5.4: Call Chain**
- MCU receives scan event from firmware -> `mt7925_mcu_scan_event()`
  queues skb and schedules `scan_work` -> `mt7925_scan_work()` processes
  scan results
- This is a common path triggered during WiFi scanning, which mac80211
  can trigger automatically

**Step 5.5: Similar Patterns**
- `pm->suspended` checks exist in: `mt792x_mac.c:278` (reset),
  `mt76_connac_mac.c:47,73` (pm_wake, power_save_sched),
  `mt7925/regd.c:196` (regd_change), `mt7921/init.c:147`
- The mt7921 `mt7921_scan_work()` does NOT have this check, which is
  consistent with it being a fix specific to the mt7925 suspend flow
  timing
- Record: [Pattern is well-established across mt76 drivers. mt7921
  doesn't have this check but has different timing characteristics.]

## PHASE 6: STABLE TREE ANALYSIS

**Step 6.1: Code Existence**
- mt7925 driver was added in `c948b5da6bbec` which is in v6.7+
- Not in v6.6 (verified: `git merge-base` confirms)
- Present in v6.12 (verified)
- For stable tree 7.0.y (the target here), the driver definitely exists
- Record: [mt7925 exists in v6.7+. Bug present since driver creation.
  Applicable to 6.7.y and later stable trees.]

**Step 6.2: Backport Complications**
- The file has had recent changes (regd_change in scan_work at line
  1361), but the fix adds code at the beginning of the function, which
  should apply cleanly to most versions
- Record: [Expected to apply cleanly — adds code at function entry
  point]

**Step 6.3: Related Fixes in Stable**
- No evidence that this specific fix is already in stable
- Record: [No prior fix for this issue found in stable]

## PHASE 7: SUBSYSTEM CONTEXT

**Step 7.1**: WiFi driver (drivers/net/wireless/mediatek/mt76/mt7925) —
IMPORTANT subsystem
- MediaTek mt7925 is a common WiFi 7 chip in modern laptops
- Record: [Driver-specific, but widely deployed WiFi hardware. IMPORTANT
  criticality.]

**Step 7.2**: Actively developed subsystem with many recent commits
- Record: [Active development, many suspend/resume fixes indicate real-
  world usage and bug reports]

## PHASE 8: IMPACT AND RISK ASSESSMENT

**Step 8.1**: Affected: Users of mt7925 WiFi hardware who suspend/resume
their systems (common laptop use case)

**Step 8.2**: Trigger: System suspend while WiFi is active and mac80211
triggers a scan. This is a common scenario on laptops.

**Step 8.3**: Failure mode: Command timeouts during suspend — can cause
suspend failures, WiFi functionality loss after resume. Severity: HIGH
(suspend/resume reliability).

**Step 8.4**:
- BENEFIT: HIGH — prevents command timeouts during suspend for common
  laptop hardware
- RISK: VERY LOW — 8 lines, single early return, follows established
  driver patterns
- Ratio: Strongly favorable

## PHASE 9: FINAL SYNTHESIS

**Step 9.1: Evidence FOR**
- Fixes a real command timeout bug during suspend/resume (common laptop
  scenario)
- Small, surgical, obviously correct fix (8 lines, single function)
- Uses well-established `pm->suspended` pattern from the driver family
- Author is driver vendor (MediaTek), applied by subsystem maintainer
  (Felix Fietkau)
- Bug has existed since driver creation (affects all stable trees with
  mt7925)
- No dependencies, standalone fix

**Evidence AGAINST**
- No Fixes: tag (expected)
- Minor concern about queued skbs not being freed on early return
  (acceptable trade-off)
- UNVERIFIED: Could not access lore discussion to check for reviewer
  concerns

**Step 9.2: Stable Rules Checklist**
1. Obviously correct? YES — simple flag check following established
   pattern
2. Fixes real bug? YES — command timeouts during suspend
3. Important issue? YES — suspend/resume reliability on laptop hardware
4. Small and contained? YES — 8 lines, 1 file, 1 function
5. No new features? YES — purely defensive check
6. Can apply to stable? YES — uses existing infrastructure

**Step 9.3: Exception Categories**
- Not an exception category; this is a standard bug fix

**Step 9.4: Decision**
This is a small, well-targeted fix for a real suspend/resume bug
affecting common WiFi hardware. The fix follows established patterns in
the driver family, was accepted by the subsystem maintainer, and carries
minimal regression risk.

## Verification

- [Phase 1] Parsed tags: SOB from MediaTek author, applied by Felix
  Fietkau (mt76 maintainer)
- [Phase 2] Diff: +8 lines in mt7925_scan_work(), adds pm->suspended
  early return check
- [Phase 3] git blame: buggy code from c948b5da6bbec (initial mt7925
  driver, Sep 2023, v6.7)
- [Phase 3] git merge-base: confirmed mt7925 NOT in v6.6, IS in v6.7 and
  v6.12
- [Phase 3] Related commits: multiple suspend fixes (bf39813599b03,
  2d5630b0c9466, 1b97fc8443aea)
- [Phase 5] Grep for pm->suspended: found 15+ uses across mt76 driver
  family, established pattern
- [Phase 5] Grep for scan_work callers: queued by mcu.c:415, cancelled
  in suspend main.c:1476
- [Phase 6] Code exists in v6.7+ stable trees
- [Phase 8] Failure mode: command timeout during suspend, severity HIGH
- UNVERIFIED: Could not access lore.kernel.org discussion due to anti-
  bot protection

**YES**

 drivers/net/wireless/mediatek/mt76/mt7925/main.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/main.c b/drivers/net/wireless/mediatek/mt76/mt7925/main.c
index 2d358a96640c9..38474cd2c38d6 100644
--- a/drivers/net/wireless/mediatek/mt76/mt7925/main.c
+++ b/drivers/net/wireless/mediatek/mt76/mt7925/main.c
@@ -1323,10 +1323,18 @@ void mt7925_mlo_pm_work(struct work_struct *work)
 void mt7925_scan_work(struct work_struct *work)
 {
 	struct mt792x_phy *phy;
+	struct mt792x_dev *dev;
+	struct mt76_connac_pm *pm;
 
 	phy = (struct mt792x_phy *)container_of(work, struct mt792x_phy,
 						scan_work.work);
 
+	dev = phy->dev;
+	pm = &dev->pm;
+
+	if (pm->suspended)
+		return;
+
 	while (true) {
 		struct sk_buff *skb;
 		struct tlv *tlv;
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH AUTOSEL 7.0-5.10] wifi: mt76: mt76x02: wake queues after reconfig
       [not found] <20260420132314.1023554-1-sashal@kernel.org>
                   ` (8 preceding siblings ...)
  2026-04-20 13:19 ` [PATCH AUTOSEL 7.0-6.12] wifi: mt76: mt7925: Skip scan process during suspend Sasha Levin
@ 2026-04-20 13:19 ` Sasha Levin
  2026-04-20 13:19 ` [PATCH AUTOSEL 7.0-6.12] wifi: mt76: mt7925: resolve link after acquiring mt76 mutex Sasha Levin
                   ` (6 subsequent siblings)
  16 siblings, 0 replies; 17+ messages in thread
From: Sasha Levin @ 2026-04-20 13:19 UTC (permalink / raw)
  To: patches, stable
  Cc: David Bauer, Felix Fietkau, Sasha Levin, lorenzo, ryder.lee,
	matthias.bgg, angelogioacchino.delregno, linux-wireless,
	linux-kernel, linux-arm-kernel, linux-mediatek

From: David Bauer <mail@david-bauer.net>

[ Upstream commit 524ef4b42b40bf1cf634663e746ace0af3fce45c ]

The shared reset procedure of MT7610 and MT7612 stop all queues before
starting the reset sequence.

They however never restart these like other supported mt76 chips
do in the reconfig_complete call. This leads to TX not continuing
after the reset.

Restart queues in the reconfig_complete callback to restore
functionality after the reset.

Signed-off-by: David Bauer <mail@david-bauer.net>
Link: https://patch.msgid.link/20251129023904.288484-1-mail@david-bauer.net
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

LLM Generated explanations, may be completely bogus:

Now I have all the information needed for a comprehensive analysis.

---

## PHASE 1: COMMIT MESSAGE FORENSICS

### Step 1.1: Subject Line
- **Subsystem**: wifi: mt76: mt76x02
- **Action verb**: "wake" - implies restoring functionality that should
  already have been there
- **Summary**: Wake TX queues after hardware reconfiguration completes
  on MT7610/MT7612

### Step 1.2: Tags
- **Signed-off-by**: David Bauer (author), Felix Fietkau (mt76
  maintainer - accepted the patch)
- **Link**: https://patch.msgid.link/20251129023904.288484-1-mail@david-
  bauer.net
- No Fixes: tag (expected for autosel review candidates)
- No Cc: stable (expected)
- No Reported-by (likely the author discovered it themselves)

### Step 1.3: Commit Body
The commit message clearly describes the bug: MT7610 and MT7612 reset
procedure calls `ieee80211_stop_queues()` at the start but never calls
`ieee80211_wake_queues()` in the restart path. Other mt76 chips do wake
queues in their `reconfig_complete` callback. The consequence is **TX
completely stops after a hardware reset/restart**.

### Step 1.4: Hidden Bug Fix Detection
This is NOT hidden - it is an explicit functional bug fix. TX stops
working after hw reset.
Record: Direct bug fix, not disguised.

---

## PHASE 2: DIFF ANALYSIS

### Step 2.1: Inventory
- **Files**: 1 file changed (`mt76x02_mmio.c`)
- **Lines**: +1 line added
- **Function modified**: `mt76x02_reconfig_complete()`
- **Scope**: Single-file, single-line surgical fix

### Step 2.2: Code Flow Change
- **Before**: `mt76x02_reconfig_complete` only clears `MT76_RESTART`
  state bit, does not wake TX queues
- **After**: Also calls `ieee80211_wake_queues(hw)` to unblock TX after
  reconfiguration

The flow is:
1. `mt76x02_watchdog_reset()` calls `ieee80211_stop_queues()` (line 439)
2. In the `restart` path (lines 518-521), it calls
   `ieee80211_restart_hw()` and sets `MT76_RESTART`
3. mac80211 does full reconfiguration, then calls
   `mt76x02_reconfig_complete()`
4. **Bug**: `reconfig_complete` only clears the state bit but never
   wakes queues
5. TX is permanently stuck

### Step 2.3: Bug Mechanism
This is a **logic/correctness bug** - missing function call on a
critical path. The queues are stopped but never restarted in the hw
restart case.

### Step 2.4: Fix Quality
- **Obviously correct**: Yes - one line adding
  `ieee80211_wake_queues()`, exactly matching what mt7915 and mt7996
  already do in their `reconfig_complete` callbacks
- **Minimal/surgical**: Yes - 1 line
- **Regression risk**: Essentially zero - this is adding a missing queue
  wake that every other mt76 driver already has
Record: Extremely high quality fix, no regression risk.

---

## PHASE 3: GIT HISTORY INVESTIGATION

### Step 3.1: Blame
The `mt76x02_reconfig_complete` function was introduced by commit
`fd6c2dfa49b762` (Felix Fietkau, 2020-02-16), which was first released
in v5.8-rc1. The function has never been modified since its introduction
- it was missing the `wake_queues` call from day one.

### Step 3.2: Fixes Target
No explicit Fixes: tag, but the bug was introduced by `fd6c2dfa49b762`
("mt76: mt76x02: fix handling MCU timeouts during hw restart") which
created the `reconfig_complete` callback without
`ieee80211_wake_queues()`. This commit exists in all stable trees from
v5.8 onward.

### Step 3.3: File History
36 commits between the buggy code introduction and HEAD. None touch the
`reconfig_complete` function.

### Step 3.4: Author
David Bauer has 5 commits to mt76, mainly focused on MT7915 MCU
improvements. Felix Fietkau (nbd@nbd.name), the mt76 maintainer, signed
off and merged this patch.

### Step 3.5: Dependencies
None. The fix adds a single call to `ieee80211_wake_queues()` which is a
standard mac80211 API available since the very beginning of the mt76
driver. Fully standalone.

---

## PHASE 4: MAILING LIST RESEARCH

### Step 4.1: Patch Discussion
b4 dig found the original submission. The mbox shows a single patch with
no replies/discussion beyond the patch itself. Felix Fietkau accepted
and merged it directly, suggesting it was obviously correct.

### Step 4.2: Reviewers
The patch was CC'd to all relevant mt76 maintainers (Felix Fietkau,
Lorenzo Bianconi, Ryder Lee, Shayne Chen, Sean Wang) and relevant
mailing lists (linux-wireless, linux-mediatek). Felix Fietkau, the
primary mt76 maintainer, directly merged it.

### Step 4.3: Bug Report
No separate bug report - the author discovered the issue.

### Step 4.4: Related Patches
Standalone single patch, not part of a series.

### Step 4.5: Stable Discussion
No stable-specific discussion found.

---

## PHASE 5: CODE SEMANTIC ANALYSIS

### Step 5.1-5.2: Key Functions and Callers
`mt76x02_reconfig_complete` is registered as the `.reconfig_complete`
callback in the `ieee80211_ops` structures for both MT7610
(mt76x0/pci.c) and MT7612 (mt76x2/pci_main.c). It is called by the
mac80211 subsystem after `ieee80211_restart_hw()` completes
reconfiguration.

### Step 5.3-5.4: Call Chain
1. TX hang detected by `mt76x02_wdt_work` → `mt76x02_check_tx_hang` →
   `mt76x02_watchdog_reset`
2. Reset stops queues and calls `ieee80211_restart_hw()`
3. mac80211 reconfigures, then calls `mt76x02_reconfig_complete`
4. Without this fix, queues stay stopped → no more TX

This is triggered on real hardware when TX hangs occur, which is a known
scenario for these WiFi chips.

### Step 5.5: Similar Patterns
Both `mt7915_reconfig_complete` and `mt7996_reconfig_complete` call
`ieee80211_wake_queues(hw)` as their first action - confirming this is
the expected pattern that was simply missed for mt76x02.

---

## PHASE 6: STABLE TREE ANALYSIS

### Step 6.1: Buggy Code Presence
The buggy `mt76x02_reconfig_complete` was introduced in v5.8-rc1 (commit
fd6c2dfa49b762). It exists in all active stable trees: 5.10.y, 5.15.y,
6.1.y, 6.6.y, 6.12.y, etc.

### Step 6.2: Backport Complexity
This is a single-line addition. The surrounding code
(`mt76x02_reconfig_complete`) has not been modified since it was
introduced in 2020. Clean apply expected in all stable trees.

### Step 6.3: Related Fixes
No related or alternative fixes found in stable trees.

---

## PHASE 7: SUBSYSTEM AND MAINTAINER CONTEXT

### Step 7.1: Subsystem
- **Path**: drivers/net/wireless/mediatek/mt76/
- **Subsystem**: WiFi driver (MediaTek MT7610/MT7612)
- **Criticality**: IMPORTANT - MT7610 and MT7612 are popular WiFi
  chipsets used in many consumer routers, access points, and USB
  adapters (especially in OpenWrt/embedded Linux)

### Step 7.2: Activity
The mt76 subsystem is actively developed with 20+ recent commits.

---

## PHASE 8: IMPACT AND RISK ASSESSMENT

### Step 8.1: Affected Users
All users of MT7610 (mt76x0) and MT7612 (mt76x2) PCIe WiFi devices.
These are common in embedded routers and access points.

### Step 8.2: Trigger
After any hardware reset triggered by TX hang detection (a real-world
scenario), the WiFi device becomes completely unable to transmit. The
watchdog periodically runs (`mt76x02_wdt_work`), and TX hangs do occur
in real hardware.

### Step 8.3: Severity
**CRITICAL** - Complete loss of WiFi TX functionality after any hw
restart. The device appears connected but cannot send any data.

### Step 8.4: Risk-Benefit
- **Benefit**: Very high - restores WiFi functionality after hw reset
  for MT7610/MT7612 users
- **Risk**: Very low - single line addition of
  `ieee80211_wake_queues()`, matching behavior of all other mt76 drivers
- **Ratio**: Excellent risk-benefit ratio

---

## PHASE 9: FINAL SYNTHESIS

### Step 9.1: Evidence
**FOR backporting:**
- Fixes complete loss of TX functionality after hw restart (complete
  WiFi outage)
- Single-line fix, obviously correct
- Matches established pattern in mt7915, mt7996, and other mt76 drivers
- Merged by the mt76 subsystem maintainer (Felix Fietkau)
- Bug present since v5.8-rc1 (2020), affects all stable trees
- No dependencies, applies cleanly
- Affects popular WiFi hardware (MT7610/MT7612)

**AGAINST backporting:**
- No explicit Cc: stable or Fixes: tag (expected for autosel)
- No user bug reports cited (but the failure mode is severe enough to
  warrant the fix regardless)

### Step 9.2: Stable Rules Checklist
1. Obviously correct and tested? **YES** - single line, matches all
   other mt76 drivers, accepted by maintainer
2. Fixes a real bug? **YES** - TX stops working after hw restart
3. Important issue? **YES** - complete loss of WiFi functionality
4. Small and contained? **YES** - 1 line in 1 file
5. No new features/APIs? **YES** - just adds missing call
6. Applies to stable? **YES** - code unchanged since v5.8

### Step 9.3: Exception Categories
Not needed - this is a straightforward bug fix.

### Step 9.4: Decision
This is a textbook stable backport candidate: a single-line fix for a
severe functional bug (complete WiFi TX outage after hw restart),
matching established patterns from other drivers, in code that exists in
all stable trees.

---

## Verification

- [Phase 1] Parsed tags: Signed-off-by from David Bauer (author) and
  Felix Fietkau (mt76 maintainer). Link to lore.kernel.org confirmed.
- [Phase 2] Diff analysis: exactly 1 line added
  (`ieee80211_wake_queues(hw)`) in `mt76x02_reconfig_complete()`.
- [Phase 3] git blame: buggy function introduced in commit
  fd6c2dfa49b762 (Felix Fietkau, 2020-02-16), first in v5.8-rc1.
  Function never modified since.
- [Phase 3] git show fd6c2dfa49b762: confirmed this commit created
  `mt76x02_reconfig_complete` without `ieee80211_wake_queues()`.
- [Phase 3] git describe --contains fd6c2dfa49b762: confirmed
  "v5.8-rc1~165^2~111^2~13^2~137"
- [Phase 4] b4 dig -c 524ef4b42b40b: found original submission at lore
  URL. Single patch, no series.
- [Phase 4] b4 dig -w: confirmed all relevant mt76 maintainers were
  CC'd.
- [Phase 4] mbox thread: no review comments, directly merged by Felix
  Fietkau.
- [Phase 5] grep reconfig_complete: confirmed mt7915 and mt7996 both
  call `ieee80211_wake_queues(hw)` in their reconfig_complete - mt76x02
  was the outlier.
- [Phase 5] grep ieee80211_stop_queues/wake_queues: confirmed stop at
  line 439, wake only in non-restart path at line 523. Restart path
  missing wake (the bug).
- [Phase 6] Code exists in all active stable trees (v5.8+). No
  modifications to the function since introduction.
- [Phase 8] Failure mode: complete WiFi TX loss after hw restart,
  severity CRITICAL.

**YES**

 drivers/net/wireless/mediatek/mt76/mt76x02_mmio.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/wireless/mediatek/mt76/mt76x02_mmio.c b/drivers/net/wireless/mediatek/mt76/mt76x02_mmio.c
index dd71c1c95cc9b..dc7c03d231238 100644
--- a/drivers/net/wireless/mediatek/mt76/mt76x02_mmio.c
+++ b/drivers/net/wireless/mediatek/mt76/mt76x02_mmio.c
@@ -534,6 +534,7 @@ void mt76x02_reconfig_complete(struct ieee80211_hw *hw,
 		return;
 
 	clear_bit(MT76_RESTART, &dev->mphy.state);
+	ieee80211_wake_queues(hw);
 }
 EXPORT_SYMBOL_GPL(mt76x02_reconfig_complete);
 
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH AUTOSEL 7.0-6.12] wifi: mt76: mt7925: resolve link after acquiring mt76 mutex
       [not found] <20260420132314.1023554-1-sashal@kernel.org>
                   ` (9 preceding siblings ...)
  2026-04-20 13:19 ` [PATCH AUTOSEL 7.0-5.10] wifi: mt76: mt76x02: wake queues after reconfig Sasha Levin
@ 2026-04-20 13:19 ` Sasha Levin
  2026-04-20 13:19 ` [PATCH AUTOSEL 7.0-6.18] wifi: mt76: mt7996: fix queue pause after scan due to wrong channel switch reason Sasha Levin
                   ` (5 subsequent siblings)
  16 siblings, 0 replies; 17+ messages in thread
From: Sasha Levin @ 2026-04-20 13:19 UTC (permalink / raw)
  To: patches, stable
  Cc: Sean Wang, Felix Fietkau, Sasha Levin, lorenzo, ryder.lee,
	matthias.bgg, angelogioacchino.delregno, linux-wireless,
	linux-kernel, linux-arm-kernel, linux-mediatek

From: Sean Wang <sean.wang@mediatek.com>

[ Upstream commit beec58f36983f826fe90287a90edff46b32e8a89 ]

mt792x_sta_to_link() uses rcu_dereference_protected() and therefore
expects mt76.mutex to be held. Move the lookup after
mt792x_mutex_acquire() to make the locking explicit and correct.

Signed-off-by: Sean Wang <sean.wang@mediatek.com>
Link: https://patch.msgid.link/20260306232238.2039675-12-sean.wang@kernel.org
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

LLM Generated explanations, may be completely bogus:

Now I have all the information needed for a thorough analysis.

---

## PHASE 1: COMMIT MESSAGE FORENSICS

### Step 1.1: Subject Line
- **Subsystem**: `wifi: mt76: mt7925` (MediaTek MT7925 WiFi driver in
  the mt76 subsystem)
- **Action verb**: "resolve" — implies correcting something that was
  wrong
- **Summary**: Move the `mt792x_sta_to_link()` lookup after acquiring
  `mt76.mutex`, because the function uses `rcu_dereference_protected()`
  which requires that lock.

Record: [wifi: mt76: mt7925] [resolve/correct] [Fix locking order:
acquire mutex before RCU-protected dereference]

### Step 1.2: Tags
- **Signed-off-by**: Sean Wang <sean.wang@mediatek.com> — the author, a
  prominent mt76/mt7925 contributor
- **Link**: https://patch.msgid.link/20260306232238.2039675-12-
  sean.wang@kernel.org — patch 12 (of 20 messages, patch 11/19)
- **Signed-off-by**: Felix Fietkau <nbd@nbd.name> — the mt76 subsystem
  maintainer who applied the patch
- No Fixes: tag, no Reported-by:, no Cc: stable — expected for candidate
  review

Record: Author is Sean Wang (regular mt76 contributor), applied by the
mt76 maintainer Felix Fietkau. No explicit Fixes: or stable tags.

### Step 1.3: Commit Body Analysis
The body says: "`mt792x_sta_to_link()` uses
`rcu_dereference_protected()` and therefore expects `mt76.mutex` to be
held." This directly describes a locking contract violation. The fix:
"Move the lookup after `mt792x_mutex_acquire()`."

Record: Bug: calling `rcu_dereference_protected()` without holding the
required lock. Symptom: lockdep warning if `CONFIG_PROVE_LOCKING` is
enabled; potential race condition for MLO vifs where the RCU pointer
could be concurrently modified.

### Step 1.4: Hidden Bug Fix Detection
This IS a bug fix despite not using the word "fix" — the commit corrects
a lock ordering violation. The `rcu_dereference_protected()` API
explicitly expects the lock to be held, and calling it without it is
incorrect.

Record: Yes, this is a real bug fix — locking correctness violation.

---

## PHASE 2: DIFF ANALYSIS

### Step 2.1: Inventory
- **File**: `drivers/net/wireless/mediatek/mt76/mt7925/main.c`
- **Function**: `mt7925_mac_link_sta_assoc()`
- **Change**: 2 lines moved (net 0 lines added/removed — pure reorder)
- **Scope**: Single function, single file, surgical

Record: 1 file changed, ~4 lines reordered within one function. Scope:
trivially small.

### Step 2.2: Code Flow Change
**Before**: `msta` and `mlink` were resolved via `mt792x_sta_to_link()`
before `mt792x_mutex_acquire(dev)` was called.

**After**: `mt792x_mutex_acquire(dev)` is called first, then `msta` and
`mlink` are resolved.

The `msta` assignment (`(struct mt792x_sta *)link_sta->sta->drv_priv`)
does not need the lock, but moving it below the mutex acquire is
harmless and keeps the code cleaner.

### Step 2.3: Bug Mechanism
This is a **synchronization / locking correctness fix**. The function
`mt792x_sta_to_link()` uses:

```293:294:drivers/net/wireless/mediatek/mt76/mt792x.h
return rcu_dereference_protected(msta->link[link_id],
        lockdep_is_held(&msta->vif->phy->dev->mt76.mutex));
```

The `rcu_dereference_protected()` call asserts that `mt76.mutex` must be
held. Calling it without the lock is:
1. A lockdep assertion violation (runtime warning with
   `CONFIG_PROVE_LOCKING`)
2. A potential race: without the mutex, the `msta->link[link_id]` RCU
   pointer could be concurrently modified (e.g., during link teardown),
   leading to use-after-free.

Note: For non-MLD (non-WiFi-7-MLO) vifs, the function returns
`&msta->deflink` early without touching RCU, so the actual RCU race only
applies to MLO connections.

Record: Synchronization/locking fix. `rcu_dereference_protected()`
called without required mutex. Race window for concurrent link
modification on MLO vifs.

### Step 2.4: Fix Quality
- **Obviously correct**: Yes — just reorders two existing operations.
- **Minimal**: Yes — net zero lines changed.
- **Regression risk**: Essentially zero. The mutex is acquired slightly
  earlier but still released at the same point. No new code is added.

Record: Fix is trivially correct with no regression risk.

---

## PHASE 3: GIT HISTORY INVESTIGATION

### Step 3.1: Blame
From the git blame, the buggy code was introduced by commit
`89397bccc882a4` ("wifi: mt76: mt7925: add mt7925_mac_link_sta_assoc to
associate per-link STA") by Sean Wang, dated 2024-06-12. This commit
first appeared in `v6.11-rc1`.

Record: Buggy code introduced in 89397bccc882a4, first in v6.11-rc1.
Present in stable trees v6.11+.

### Step 3.2: Fixes Tag
No explicit Fixes: tag, but the bug was clearly introduced by
`89397bccc882a4`.

Record: The original commit created the function
`mt7925_mac_link_sta_assoc` with the lock ordering issue from day one.

### Step 3.3: File History
The file has had ~46 commits since v6.11. Several are related locking
fixes (e.g., `9f15701370ec1` "fix locking in
mt7925_change_vif_links()").

Record: Active file with other locking fixes. This is standalone.

### Step 3.4: Author
Sean Wang is a prolific contributor to mt76/mt7925 and the author of the
MLO link support. He's also the author of the original buggy commit, so
this is the author fixing their own oversight.

Record: Author is the subsystem developer who introduced the bug.

### Step 3.5: Dependencies
This is patch 11/19 in a series, but the fix itself is **self-
contained**: it only reorders existing lines within one function. It has
no dependency on any other patch in the series.

Record: Self-contained, no dependencies.

---

## PHASE 4: MAILING LIST RESEARCH

### Step 4.1: Original Discussion
The patch was submitted as `[PATCH 11/19]` in a series titled "wifi:
mt76: mt7925: fix up MLO link lifetime and error handling". The series
includes a mix of refactoring and fixes. This specific patch is a pure
correctness fix.

Record: Part of a 19-patch series for MLO link lifetime improvements.
This patch is standalone.

### Step 4.2: Reviewers
The series was sent to `nbd@nbd.name` (Felix Fietkau, mt76 maintainer)
and `lorenzo.bianconi@redhat.com`. Applied by Felix Fietkau.

Record: Applied by subsystem maintainer.

### Step 4.3-4.5: Bug Reports / Stable Discussion
No specific bug report or syzbot link. No specific stable discussion
found. The kernel test robot reported build issues on patch 2/19 only,
not on this patch.

Record: No external bug reports. The issue is self-evident from code
inspection.

---

## PHASE 5: CODE SEMANTIC ANALYSIS

### Step 5.1: Key Functions
- `mt7925_mac_link_sta_assoc()` — the function being fixed
- `mt792x_sta_to_link()` — the inline function that requires the mutex

### Step 5.2: Callers
`mt7925_mac_link_sta_assoc()` is called from `mt7925_mac_sta_event()`
(line 1078), which is exported via `EXPORT_SYMBOL_GPL` and called during
station association events via the mac80211 callback path. This is a
common WiFi operational path.

### Step 5.4: Reachability
The code path is: mac80211 sta_event callback ->
`mt7925_mac_sta_event()` -> `mt7925_mac_link_sta_assoc()`. This is
triggered during WiFi association, which is a very common operation.

Record: The buggy code is on a common WiFi association path, reachable
during normal operation.

---

## PHASE 6: STABLE TREE ANALYSIS

### Step 6.1: Presence in Stable Trees
- Buggy commit `89397bccc882a4` is NOT in v6.10 (verified)
- It IS in v6.11+ (verified: v6.11-rc1)
- Relevant stable trees: 6.11.y, 6.12.y, and any later LTS

### Step 6.2: Backport Complications
The patch is a trivial reorder of existing lines. It should apply
cleanly to any tree that has the buggy commit.

Record: Clean apply expected for all trees with the buggy code (v6.11+).

---

## PHASE 7: SUBSYSTEM CONTEXT

### Step 7.1: Subsystem Criticality
- Subsystem: `drivers/net/wireless/mediatek/mt76/mt7925/` — WiFi driver
- Criticality: IMPORTANT — MediaTek MT7925 is a WiFi 7 chip used in many
  modern laptops
- The fix is specifically for the MLO (Multi-Link Operation) code path

Record: IMPORTANT subsystem — popular WiFi 7 chip. Bug affects MLO
connections.

---

## PHASE 8: IMPACT AND RISK ASSESSMENT

### Step 8.1: Who Is Affected
Users of MediaTek MT7925 WiFi 7 hardware using MLO (Multi-Link
Operation). For non-MLO connections, `mt792x_sta_to_link()` takes the
early `deflink` return path and doesn't touch RCU.

### Step 8.2: Trigger Conditions
- Triggered during WiFi association with MLO
- Common operation for WiFi 7 users
- With `CONFIG_PROVE_LOCKING`: always triggers a warning
- Without: race window exists but may be hard to hit

### Step 8.3: Failure Mode Severity
- **With lockdep**: WARNING (lock assertion failure) — MEDIUM
- **Without lockdep**: Potential use-after-free if link is concurrently
  modified — HIGH (crash/corruption risk, though race window is small)

### Step 8.4: Risk-Benefit Ratio
- **Benefit**: Fixes locking correctness and prevents lockdep warnings +
  potential UAF for MLO users. LOW-MEDIUM benefit.
- **Risk**: Essentially ZERO risk — the fix is a trivial reorder of two
  existing operations.
- **Ratio**: Very favorable — zero risk for a real correctness fix.

---

## PHASE 9: FINAL SYNTHESIS

### Step 9.1: Evidence Summary

**FOR backporting:**
- Fixes a real locking correctness violation
  (`rcu_dereference_protected()` without lock)
- The fix is trivially correct — just reorders 2 lines
- Zero regression risk
- Self-contained, no dependencies on other patches
- Applied by subsystem maintainer
- Bug exists in stable trees v6.11+
- On a common WiFi association code path

**AGAINST backporting:**
- Part of a 19-patch series (though this patch is standalone)
- Primarily affects WiFi 7 MLO users (subset of mt7925 users)
- Without lockdep, the actual race is theoretically possible but hard to
  trigger
- No user reports or syzbot reports of the issue

### Step 9.2: Stable Rules Checklist
1. Obviously correct and tested? **YES** — trivial reorder, applied by
   maintainer
2. Fixes a real bug? **YES** — locking correctness violation
3. Important issue? **MEDIUM** — lockdep warning + potential UAF race on
   MLO
4. Small and contained? **YES** — ~4 lines reordered in one function
5. No new features or APIs? **YES** — pure fix
6. Can apply to stable trees? **YES** — should apply cleanly to v6.11+

### Step 9.3: Exception Categories
Not an exception category — this is a standard bug fix.

### Step 9.4: Decision
The fix is a trivially correct locking fix with zero regression risk. It
fixes a real `rcu_dereference_protected()` locking violation that
triggers lockdep warnings and has a theoretical UAF race. While the
impact is limited to WiFi 7 MLO users, the fix is so small and safe that
the risk-benefit ratio strongly favors backporting.

---

## Verification

- [Phase 1] Parsed tags: Author is Sean Wang (MediaTek), applied by
  Felix Fietkau (mt76 maintainer)
- [Phase 2] Diff analysis: ~4 lines reordered in
  `mt7925_mac_link_sta_assoc()` — moves mutex_acquire before
  `mt792x_sta_to_link()`
- [Phase 2] Verified `mt792x_sta_to_link()` uses
  `rcu_dereference_protected()` with `lockdep_is_held(&mt76.mutex)` at
  lines 293-294 of mt792x.h
- [Phase 2] Verified `mt792x_mutex_acquire()` expands to
  `mutex_lock(&dev->mutex)` via `mt76_connac_mutex_acquire()` at line
  401 of mt76_connac.h
- [Phase 3] git blame: buggy code introduced by 89397bccc882a4 (Sean
  Wang, 2024-06-12)
- [Phase 3] git describe: buggy commit first appears in v6.11-rc1
- [Phase 3] git merge-base: confirmed NOT in v6.10, YES in v6.11 and
  v6.12
- [Phase 4] b4 mbox: found series cover "fix up MLO link lifetime and
  error handling" (19 patches)
- [Phase 4] This is patch 11/19, but is self-contained
- [Phase 5] `mt7925_mac_link_sta_assoc()` called from
  `mt7925_mac_sta_event()` (line 1078), triggered during WiFi
  association
- [Phase 6] Code exists in stable trees v6.11+
- [Phase 8] Failure mode: lockdep WARNING (MEDIUM) or potential UAF race
  on MLO (HIGH but hard to trigger)
- UNVERIFIED: Could not access lore.kernel.org directly due to anti-bot
  protection to check for reviewer comments on patch 11/19 specifically

**YES**

 drivers/net/wireless/mediatek/mt76/mt7925/main.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/main.c b/drivers/net/wireless/mediatek/mt76/mt7925/main.c
index 38474cd2c38d6..caafac14d97ba 100644
--- a/drivers/net/wireless/mediatek/mt76/mt7925/main.c
+++ b/drivers/net/wireless/mediatek/mt76/mt7925/main.c
@@ -1030,11 +1030,11 @@ static void mt7925_mac_link_sta_assoc(struct mt76_dev *mdev,
 	struct mt792x_link_sta *mlink;
 	struct mt792x_sta *msta;
 
+	mt792x_mutex_acquire(dev);
+
 	msta = (struct mt792x_sta *)link_sta->sta->drv_priv;
 	mlink = mt792x_sta_to_link(msta, link_sta->link_id);
 
-	mt792x_mutex_acquire(dev);
-
 	if (ieee80211_vif_is_mld(vif)) {
 		link_conf = mt792x_vif_to_bss_conf(vif, msta->deflink_id);
 	} else {
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH AUTOSEL 7.0-6.18] wifi: mt76: mt7996: fix queue pause after scan due to wrong channel switch reason
       [not found] <20260420132314.1023554-1-sashal@kernel.org>
                   ` (10 preceding siblings ...)
  2026-04-20 13:19 ` [PATCH AUTOSEL 7.0-6.12] wifi: mt76: mt7925: resolve link after acquiring mt76 mutex Sasha Levin
@ 2026-04-20 13:19 ` Sasha Levin
  2026-04-20 13:20 ` [PATCH AUTOSEL 7.0-6.12] net: ethernet: mtk_eth_soc: avoid writing to ESW registers on MT7628 Sasha Levin
                   ` (4 subsequent siblings)
  16 siblings, 0 replies; 17+ messages in thread
From: Sasha Levin @ 2026-04-20 13:19 UTC (permalink / raw)
  To: patches, stable
  Cc: StanleyYP Wang, Shayne Chen, Felix Fietkau, Sasha Levin, lorenzo,
	ryder.lee, matthias.bgg, angelogioacchino.delregno,
	linux-wireless, linux-kernel, linux-arm-kernel, linux-mediatek

From: StanleyYP Wang <StanleyYP.Wang@mediatek.com>

[ Upstream commit 1f9017d19db38ad2cb9bedb5b078f6f4f60afa94 ]

Previously, we used the IEEE80211_CONF_IDLE flag to avoid setting the
parking channel with the CH_SWITCH_NORMAL reason, which could trigger TX
emission before bootup CAC.

However, we found that this flag can be set after triggering scanning on a
connected station interface, and the reason CH_SWITCH_SCAN_BYPASS_DPD will
be used when switching back to the operating channel, which makes the
firmware failed to resume paused AC queues.

Seems that we should avoid relying on this flag after switching to single
multi-radio architecture. Instead, use the existence of chanctx as the
condition.

Signed-off-by: StanleyYP Wang <StanleyYP.Wang@mediatek.com>
Signed-off-by: Shayne Chen <shayne.chen@mediatek.com>
Link: https://patch.msgid.link/20260203155532.1098290-4-shayne.chen@mediatek.com
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

LLM Generated explanations, may be completely bogus:

Here is my complete analysis:

---

## PHASE 1: COMMIT MESSAGE FORENSICS

### Step 1.1: Subject Line
- **Subsystem**: `wifi: mt76: mt7996:` (MediaTek mt7996 WiFi 7 chipset
  driver)
- **Action verb**: "fix" — explicit bug fix
- **Summary**: Fixes queue pause after scan due to wrong channel switch
  reason being sent to firmware

Record: [wifi: mt76: mt7996] [fix] [queue pause after scan due to wrong
channel switch reason]

### Step 1.2: Tags
- **Signed-off-by**: StanleyYP Wang (author, MediaTek engineer), Shayne
  Chen (MediaTek, submitter), Felix Fietkau (mt76 subsystem maintainer)
- **Link**: `https://patch.msgid.link/20260203155532.1098290-4-
  shayne.chen@mediatek.com` — patch 4 of a series
- No Fixes: tag (expected)
- No Reported-by tag
- No Cc: stable tag (expected)

Record: Patch is from MediaTek engineers (hardware vendor), signed off
by the mt76 maintainer Felix Fietkau. Part of a series (patch 4).

### Step 1.3: Commit Body Analysis
The commit explains:
1. **Previous approach**: Used `IEEE80211_CONF_IDLE` flag to avoid
   setting parking channel with `CH_SWITCH_NORMAL` reason (which could
   trigger TX emission before bootup CAC).
2. **Bug discovered**: After scanning on a connected station interface,
   the `IEEE80211_CONF_IDLE` flag can be set. When switching back to the
   operating channel, the wrong reason `CH_SWITCH_SCAN_BYPASS_DPD` is
   used, causing firmware to fail to resume paused AC queues.
3. **Fix**: Use the existence of `chanctx` (channel context) instead of
   the IDLE flag, which is more appropriate for the multi-radio
   architecture.

Record: Bug causes TX queues to remain paused after scan on a connected
station interface. Firmware-level failure to resume AC queues. Root
cause is the `IEEE80211_CONF_IDLE` flag being unreliable after the
multi-radio architecture switch.

### Step 1.4: Hidden Bug Fix Detection
Not hidden — explicitly labeled "fix" with clear bug mechanism
described.

---

## PHASE 2: DIFF ANALYSIS

### Step 2.1: Inventory
- **Files changed**: 1
  (`drivers/net/wireless/mediatek/mt76/mt7996/mcu.c`)
- **Lines changed**: 2 lines modified (1 removed, 1 added — net -1 line)
- **Function modified**: `mt7996_mcu_set_chan_info()`
- **Scope**: Single-file, single-line surgical fix

### Step 2.2: Code Flow Change
The only change is in the condition for setting `switch_reason`:

Before:
```c
else if (phy->mt76->offchannel ||
         phy->mt76->hw->conf.flags & IEEE80211_CONF_IDLE)
    req.switch_reason = CH_SWITCH_SCAN_BYPASS_DPD;
```

After:
```c
else if (phy->mt76->offchannel || !phy->mt76->chanctx)
    req.switch_reason = CH_SWITCH_SCAN_BYPASS_DPD;
```

The `IEEE80211_CONF_IDLE` flag check is replaced by
`!phy->mt76->chanctx` (channel context is NULL). Both mean "no active
operating channel," but `chanctx` is the correct indicator in the multi-
radio architecture.

### Step 2.3: Bug Mechanism
**Logic/correctness fix**: The condition for determining which channel
switch reason to send to firmware was wrong. The `IEEE80211_CONF_IDLE`
flag can be spuriously set after scanning on a connected station,
causing the firmware to use `CH_SWITCH_SCAN_BYPASS_DPD` instead of
`CH_SWITCH_NORMAL` when returning to the operating channel. This makes
firmware fail to resume paused TX AC queues.

### Step 2.4: Fix Quality
- **Obviously correct**: Yes — `chanctx` directly indicates if a channel
  context exists, which is the semantic meaning needed.
- **Minimal/surgical**: Yes — 1 line change.
- **Regression risk**: Very low — `chanctx` is NULL only when no channel
  context is assigned, which is semantically equivalent to (and more
  accurate than) the IDLE flag check.

---

## PHASE 3: GIT HISTORY INVESTIGATION

### Step 3.1: Blame
- The `IEEE80211_CONF_IDLE` check was introduced in commit
  `413f05d68d1198` (StanleyYP Wang, 2023-08-31, first in v6.7-rc1):
  "wifi: mt76: get rid of false alarms of tx emission issues"
- The `offchannel` field was introduced in `f4fdd7716290a2` (Felix
  Fietkau, 2024-08-28, first in v6.12-rc1): "wifi: mt76: partially move
  channel change code to core"
- The `chanctx` field and multi-radio architecture was introduced in
  commits `82334623af0cd` and `69d54ce7491d` (Felix Fietkau, 2025-01-02,
  first in v6.14-rc1)

Record: The bug only manifests from v6.14 onwards (when multi-radio
architecture was introduced and chanctx is used). The IDLE flag check
was fine before the architecture change.

### Step 3.2: No Fixes: tag present (expected).

### Step 3.3: File History
The mcu.c file is actively maintained with many recent fixes. The fix is
self-contained and standalone.

### Step 3.4: Author Context
StanleyYP Wang (author) is a regular MediaTek contributor working on
mt76 radar/DFS/channel features. Shayne Chen is the primary MediaTek
mt7996 contributor. Felix Fietkau is the mt76 subsystem maintainer who
signed off.

### Step 3.5: Dependencies
The fix uses `phy->mt76->chanctx` which exists in all trees from v6.14
onwards. No other dependencies needed.

---

## PHASE 4: MAILING LIST RESEARCH

Lore.kernel.org and patch.msgid.link are protected by Anubis anti-
scraping, so web fetch failed. b4 dig could not find the commit (it's a
candidate, not yet in tree).

The Link: URL (`20260203155532.1098290-4-shayne.chen@mediatek.com`)
shows this is patch 4 of a series, but the fix is completely self-
contained — it only changes one condition in one function.

Record: Could not access lore discussion due to anti-bot protection.
Patch 4 of a series, but standalone.

---

## PHASE 5: CODE SEMANTIC ANALYSIS

### Step 5.1-5.4: Function Call Analysis
`mt7996_mcu_set_chan_info()` is called from:
1. `mt7996_run()` (line 25 in main.c) — during interface start, with
   `UNI_CHANNEL_RX_PATH`
2. `mt7996_set_channel()` (lines 561, 565 in main.c) — during channel
   switch, with both `UNI_CHANNEL_SWITCH` and `UNI_CHANNEL_RX_PATH`

The `mt7996_set_channel()` path is the critical one — this is called
during scan return (switching back to operating channel). This is a hot
path triggered by every scan operation.

Record: Function called on every channel switch, including post-scan
return. Bug affects all users who scan while connected.

---

## PHASE 6: STABLE TREE ANALYSIS

### Step 6.1: Buggy Code in Stable Trees
- **v6.14.y through v6.19.y**: YES — all contain the buggy
  `IEEE80211_CONF_IDLE` check and have `chanctx` infrastructure
  available.
- **v6.12.y and earlier**: NO — don't have multi-radio architecture; the
  bug doesn't exist there (different code path).

Verified: v6.14.11 and v6.19.12 both have the exact same buggy code and
have the `chanctx` field available.

### Step 6.2: Backport Complications
The fix should apply cleanly to all affected stable trees (v6.14.y
through v6.19.y). The surrounding code context matches exactly.

---

## PHASE 7: SUBSYSTEM CONTEXT

### Step 7.1: Subsystem
- **Subsystem**: WiFi driver (drivers/net/wireless/mediatek/mt76/mt7996)
- **Criticality**: IMPORTANT — mt7996 is MediaTek's WiFi 7 chipset used
  in many routers and APs
- **Activity**: Very active development (many recent commits)

---

## PHASE 8: IMPACT AND RISK ASSESSMENT

### Step 8.1: Affected Population
Users of mt7996-based WiFi hardware (WiFi 7 routers, APs, and client
devices) running kernels v6.14+.

### Step 8.2: Trigger Conditions
Triggering a scan while connected as a station interface. This is a
**very common operation** — background scanning happens regularly for
roaming decisions, network discovery, etc.

### Step 8.3: Failure Mode Severity
TX AC queues remain paused indefinitely after scan. This means **the
WiFi interface effectively stops transmitting data**. Severity: **HIGH**
— complete loss of WiFi transmit functionality until manual
intervention.

### Step 8.4: Risk-Benefit Ratio
- **Benefit**: HIGH — prevents complete WiFi TX failure on mt7996
  hardware after scanning
- **Risk**: VERY LOW — 1-line condition change, replacing one "no active
  channel" check with a more accurate one
- **Ratio**: Very favorable

---

## PHASE 9: FINAL SYNTHESIS

### Step 9.1: Evidence Summary

**FOR backporting:**
- Fixes a real, functional bug: TX queues stuck after scan
- Very high impact: WiFi effectively stops working (no transmit)
- Common trigger: scanning while connected as station
- Minimal change: 1 line, obviously correct
- From the hardware vendor (MediaTek) and signed off by subsystem
  maintainer (Felix Fietkau)
- Applies cleanly to affected stable trees (v6.14.y through v6.19.y)
- No dependencies on other patches
- Zero regression risk

**AGAINST backporting:**
- No Fixes: tag (expected — that's why it needs review)
- Part of a series (patch 4), but the fix is self-contained
- Only affects v6.14+ trees (bug doesn't exist in older LTS trees)

### Step 9.2: Stable Rules Checklist
1. Obviously correct and tested? **YES** — trivial condition
   replacement, from hardware vendor
2. Fixes a real bug? **YES** — TX queues stuck = complete WiFi transmit
   failure
3. Important issue? **YES** — effectively disables WiFi after common
   operation
4. Small and contained? **YES** — 1 line change in 1 file
5. No new features or APIs? **CORRECT** — no new features
6. Can apply to stable trees? **YES** — applies cleanly to v6.14.y
   through v6.19.y

### Step 9.3: Exception Categories
Not needed — meets standard stable criteria as a bug fix.

---

## Verification

- [Phase 1] Parsed subject: explicit "fix" for queue pause after scan,
  WiFi mt76/mt7996 subsystem
- [Phase 2] Diff: 1 line changed in `mt7996_mcu_set_chan_info()`,
  replaces `IEEE80211_CONF_IDLE` flag with `!chanctx` check
- [Phase 3] git blame: `IEEE80211_CONF_IDLE` check added in
  413f05d68d1198 (v6.7-rc1), became buggy after multi-radio switch in
  69d54ce7491d (v6.14-rc1)
- [Phase 3] git describe --contains: chanctx infra first in v6.14-rc1,
  confirmed via `git merge-base --is-ancestor`
- [Phase 3] git show 413f05d68d1198: confirmed original CONF_IDLE commit
  purpose was DFS CAC false alarm avoidance
- [Phase 5] Grep callers: `mt7996_mcu_set_chan_info()` called from
  `mt7996_run()` and `mt7996_set_channel()` — channel switch path
  triggered on every scan return
- [Phase 6] git show v6.14.11/v6.19.12: confirmed buggy code present in
  both stable trees, chanctx field available
- [Phase 6] v6.12.y: does NOT have the bug (no multi-radio architecture,
  different code)
- [Phase 7] Felix Fietkau confirmed as mt76 maintainer (signed off on
  the fix)
- UNVERIFIED: Could not access lore.kernel.org discussion due to anti-
  bot protection

The fix is a clear, minimal, single-line correction that prevents WiFi
TX queues from permanently stalling after scan operations on mt7996
hardware. It meets all stable kernel criteria.

**YES**

 drivers/net/wireless/mediatek/mt76/mt7996/mcu.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/net/wireless/mediatek/mt76/mt7996/mcu.c b/drivers/net/wireless/mediatek/mt76/mt7996/mcu.c
index 0abe5efa9424e..470e69eacb7da 100644
--- a/drivers/net/wireless/mediatek/mt76/mt7996/mcu.c
+++ b/drivers/net/wireless/mediatek/mt76/mt7996/mcu.c
@@ -3748,8 +3748,7 @@ int mt7996_mcu_set_chan_info(struct mt7996_phy *phy, u16 tag)
 
 	if (phy->mt76->hw->conf.flags & IEEE80211_CONF_MONITOR)
 		req.switch_reason = CH_SWITCH_NORMAL;
-	else if (phy->mt76->offchannel ||
-		 phy->mt76->hw->conf.flags & IEEE80211_CONF_IDLE)
+	else if (phy->mt76->offchannel || !phy->mt76->chanctx)
 		req.switch_reason = CH_SWITCH_SCAN_BYPASS_DPD;
 	else if (!cfg80211_reg_can_beacon(phy->mt76->hw->wiphy, chandef,
 					  NL80211_IFTYPE_AP))
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH AUTOSEL 7.0-6.12] net: ethernet: mtk_eth_soc: avoid writing to ESW registers on MT7628
       [not found] <20260420132314.1023554-1-sashal@kernel.org>
                   ` (11 preceding siblings ...)
  2026-04-20 13:19 ` [PATCH AUTOSEL 7.0-6.18] wifi: mt76: mt7996: fix queue pause after scan due to wrong channel switch reason Sasha Levin
@ 2026-04-20 13:20 ` Sasha Levin
  2026-04-20 13:20 ` [PATCH AUTOSEL 7.0-6.6] wifi: mt76: mt7996: reset device after MCU message timeout Sasha Levin
                   ` (3 subsequent siblings)
  16 siblings, 0 replies; 17+ messages in thread
From: Sasha Levin @ 2026-04-20 13:20 UTC (permalink / raw)
  To: patches, stable
  Cc: Joris Vaisvila, Daniel Golle, Stefan Roese, Jakub Kicinski,
	Sasha Levin, nbd, lorenzo, andrew+netdev, davem, edumazet, pabeni,
	matthias.bgg, angelogioacchino.delregno, netdev, linux-kernel,
	linux-arm-kernel, linux-mediatek

From: Joris Vaisvila <joey@tinyisr.com>

[ Upstream commit 9a04d3b2f0708a9e5e1f731bafb69b040bb934a0 ]

The MT7628 has a fixed-link PHY and does not expose MAC control
registers. Writes to these registers only corrupt the ESW VLAN
configuration.

This patch explicitly registers no-op phylink_mac_ops for MT7628, as
after removing the invalid register accesses, the existing
phylink_mac_ops effectively become no-ops.

This code was introduced by commit 296c9120752b
("net: ethernet: mediatek: Add MT7628/88 SoC support")

Signed-off-by: Joris Vaisvila <joey@tinyisr.com>
Reviewed-by: Daniel Golle <daniel@makrotpia.org>
Reviewed-by: Stefan Roese <stefan.roese@mailbox.org>
Link: https://patch.msgid.link/20260226154547.68553-1-joey@tinyisr.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

LLM Generated explanations, may be completely bogus:

Now I have all the information needed for a thorough analysis. Let me
compile my findings.

---

## PHASE 1: COMMIT MESSAGE FORENSICS

**Step 1.1: Subject Line**
- Subsystem: `net: ethernet: mtk_eth_soc`
- Action verb: "avoid" — strongly implies a bug fix (preventing harmful
  behavior)
- Summary: Avoid writing to ESW (Embedded Switch) registers on MT7628 to
  prevent corruption

**Step 1.2: Tags**
- No `Fixes:` tag, but commit body references `296c9120752b` ("net:
  ethernet: mediatek: Add MT7628/88 SoC support") from August 2019 as
  the introducing commit
- `Signed-off-by: Joris Vaisvila <joey@tinyisr.com>` — author
- `Reviewed-by: Daniel Golle <daniel@makrotpia.org>` — MediaTek ethernet
  maintainer/expert
- `Reviewed-by: Stefan Roese <stefan.roese@mailbox.org>` — original
  author of the MT7628 support commit
- `Link:` to patch.msgid.link (standard netdev submission)
- `Signed-off-by: Jakub Kicinski <kuba@kernel.org>` — netdev maintainer
  applied it

Record: Two reviewer tags from highly relevant people (original MT7628
author + subsystem expert). No syzbot. No explicit Cc: stable.

**Step 1.3: Commit Body**
- Bug: MT7628 has a fixed-link PHY and does not expose MAC control
  registers. Writes to `MTK_MAC_MCR(x)` (offset 0x10100) on MT7628 hit
  the ESW VLAN configuration instead of non-existent MAC control
  registers.
- Symptom: VLAN configuration corruption on MT7628
- Root cause: The phylink_mac_ops callbacks (`link_down`, `link_up`,
  `mac_finish`) write to `MTK_MAC_MCR` registers without checking for
  MT7628

**Step 1.4: Hidden Bug Fix Detection**
This is clearly a data corruption fix. The word "avoid" means preventing
invalid register writes that corrupt VLAN config.

## PHASE 2: DIFF ANALYSIS

**Step 2.1: Inventory**
- Single file: `drivers/net/ethernet/mediatek/mtk_eth_soc.c`
- Approximate: +27 lines added, -5 lines removed
- Functions modified: `mtk_mac_config` (guard removed), `mtk_add_mac`
  (ops selection added)
- Functions added: `rt5350_mac_config`, `rt5350_mac_link_down`,
  `rt5350_mac_link_up` (all no-ops), `rt5350_phylink_ops` (new ops
  struct)

**Step 2.2: Code Flow Change**
1. In `mtk_mac_config`: The `!MTK_HAS_CAPS(eth->soc->caps,
   MTK_SOC_MT7628)` guard was removed. Safe because MT7628 now uses
   entirely different (no-op) ops, so this function is never called for
   MT7628.
2. In `mtk_add_mac`: Added conditional to select `rt5350_phylink_ops`
   for MT7628 instead of `mtk_phylink_ops`.
3. New no-op functions: `rt5350_mac_config`, `rt5350_mac_link_down`,
   `rt5350_mac_link_up` — all empty.

**Step 2.3: Bug Mechanism**
Category: **Hardware workaround / data corruption fix**

The bug: On MT7628, register offset 0x10100 is part of the ESW VLAN
configuration, not a MAC control register. The existing
`mtk_mac_link_down()`, `mtk_mac_link_up()`, and `mtk_mac_finish()` all
write to `MTK_MAC_MCR(mac->id)` (= 0x10100) without MT7628 checks. Only
`mtk_mac_config()` had a guard. Every link state change event corrupts
the VLAN configuration.

**Step 2.4: Fix Quality**
- Obviously correct: The fix prevents ALL register writes by
  substituting no-op callbacks
- Minimal regression risk: Empty callbacks for a fixed-link PHY that
  never needed MAC configuration
- Self-contained in one file
- Reviewed by the original MT7628 author (Stefan Roese) and MediaTek
  network expert (Daniel Golle)

## PHASE 3: GIT HISTORY

**Step 3.1: Blame**
- The buggy code in `mtk_mac_link_down`/`mtk_mac_link_up` was introduced
  by `b8fc9f30821ec0` (René van Dorst, 2019-08-25) during the phylink
  conversion
- The `mtk_mac_config` guard was already in `b8fc9f30821ec0` but was
  never added to `link_down`/`link_up`/`finish`

**Step 3.2: Original commit**
- `296c9120752b` ("Add MT7628/88 SoC support") was merged in v5.3-rc6
  (August 2019)
- This commit is present in all stable trees from v5.3 onwards
  (confirmed in p-5.10, p-5.15 tags)

**Step 3.3/3.4: Author & File History**
- Joris Vaisvila is not a frequent kernel contributor (only 1-2 commits
  found)
- However, both reviewers are well-known in this subsystem
- File has 231 commits since 296c9120752b; 32 since v6.12

**Step 3.5: Dependencies**
- The patch is self-contained. The no-op ops pattern doesn't depend on
  any other patches.
- In v6.6, the `mtk_mac_finish` function also writes to `MTK_MAC_MCR`
  without MT7628 guard — same bug. The no-op ops approach fixes all
  callbacks at once.

## PHASE 4: MAILING LIST

Lore/b4 dig returned results but couldn't access full discussions due to
Anubis protection. The patch was submitted as
`20260226154547.68553-1-joey@tinyisr.com` and accepted by Jakub Kicinski
(netdev maintainer).

## PHASE 5: CODE SEMANTIC ANALYSIS

**Step 5.1-5.4: Impact Surface**
- `mtk_mac_link_down` is called by phylink whenever the link goes down —
  every cable disconnect, PHY negotiation change
- `mtk_mac_link_up` is called on every link up event
- `mtk_mac_finish` is called during PHY configuration
- On MT7628, these are called regularly during normal operation
- `mtk_set_mcr_max_rx` at line 3886 already has its own `MTK_SOC_MT7628`
  guard, confirming the developers know these registers don't exist on
  MT7628

## PHASE 6: STABLE TREE ANALYSIS

**Step 6.1:** The buggy code exists in ALL stable trees from v5.3+,
including v5.15, v6.1, v6.6, and 6.12.
- In v6.6: `mtk_mac_link_down` at line 689 unconditionally writes to
  `MTK_MAC_MCR` — confirmed the same bug
- In v6.6: `mtk_mac_link_up` at line 769 also unconditionally writes to
  `MTK_MAC_MCR` — confirmed
- In v6.6: `mtk_mac_finish` at line 660 also writes to `MTK_MAC_MCR` —
  confirmed

**Step 6.2: Backport Difficulty**
For v7.0: Should apply cleanly or with minor fuzz.
For v6.6 and older: Will need rework. The `mtk_mac_link_down`/`link_up`
implementations differ significantly (v7.0 has xgmii handling added by
`51cf06ddafc91e`). However, the *concept* of the fix (separate no-op
ops) is portable.

## PHASE 7: SUBSYSTEM CONTEXT

- Subsystem: Network driver (embedded Ethernet), IMPORTANT criticality
  for MT7628 users
- MT7628/MT7688 is a widely-used MIPS SoC found in popular embedded
  platforms (Omega2, VoCore2, many OpenWrt routers)

## PHASE 8: IMPACT AND RISK ASSESSMENT

**Step 8.1: Affected Users**
- All MT7628/MT7688 users (embedded routers running Linux with VLANs)

**Step 8.2: Trigger Conditions**
- Triggered on every link state change (boot, cable plug/unplug, PHY
  state change)
- Extremely common — happens during normal boot

**Step 8.3: Failure Mode**
- **ESW VLAN configuration corruption** — MEDIUM-HIGH severity
- VLAN configuration is silently corrupted, leading to incorrect network
  behavior
- Not a crash but a data corruption issue affecting network
  configuration

**Step 8.4: Risk-Benefit**
- Benefit: HIGH — prevents VLAN corruption on every MT7628 system
- Risk: LOW — the fix adds empty callback functions and selects them
  conditionally; the no-op approach is obviously correct for a fixed-
  link PHY with no MAC control registers

## PHASE 9: FINAL SYNTHESIS

**Evidence FOR backporting:**
1. Fixes real data corruption (VLAN config) on real hardware
   (MT7628/MT7688)
2. Bug present since v5.3 (2019) — affects all stable trees
3. Reviewed by original MT7628 author and subsystem expert
4. Accepted by netdev maintainer (Jakub Kicinski)
5. Fix is obviously correct (no-op callbacks for hardware without MAC
   registers)
6. Single file change, well-contained
7. Other code in the same file already has MT7628 guards for the same
   registers (`mtk_set_mcr_max_rx` at line 3886)

**Evidence AGAINST backporting:**
1. ~30 lines of new code (not trivially small, but straightforward)
2. May need rework for older stable trees (6.6, 6.1, 5.15) due to
   function refactoring
3. No explicit Cc: stable (expected for commits under review)

**Stable Rules Checklist:**
1. Obviously correct and tested? YES — reviewed by 2 experts, one the
   original author
2. Fixes a real bug? YES — VLAN config corruption on MT7628
3. Important issue? YES — data corruption, affects all MT7628 users
4. Small and contained? YES — single file, ~30 lines
5. No new features? CORRECT — only prevents invalid register writes
6. Can apply to stable? YES for 7.0; needs rework for older trees

**Verification:**
- [Phase 1] Parsed tags: Reviewed-by from Daniel Golle and Stefan Roese,
  Link to netdev submission
- [Phase 2] Diff analysis: Adds no-op phylink_mac_ops for MT7628,
  selects them in `mtk_add_mac()`, removes now-unreachable guard in
  `mtk_mac_config()`
- [Phase 3] git blame: buggy code from `b8fc9f30821ec0` (2019), phylink
  conversion missing MT7628 guards in link_down/link_up
- [Phase 3] git show 296c9120752b: confirmed original MT7628 support
  commit from v5.3 era (2019-08-16)
- [Phase 3] git tag --contains: original commit present in p-5.10,
  p-5.15 tags (all active stable trees)
- [Phase 5] Verified `mtk_mac_link_down` writes to
  `MTK_MAC_MCR(mac->id)` without MT7628 check (line 731 in current code)
- [Phase 5] Verified `mtk_mac_link_up`→`mtk_gdm_mac_link_up` writes to
  `MTK_MAC_MCR(mac->id)` (line 846) without MT7628 check
- [Phase 5] Verified `mtk_mac_finish` writes to `MTK_MAC_MCR(mac->id)`
  (line 709/716) without MT7628 check
- [Phase 5] Confirmed `mtk_set_mcr_max_rx` (line 3886) already guards
  against MT7628, proving developers know these registers don't exist on
  MT7628
- [Phase 6] Verified v6.6 stable has the same bug: `mtk_mac_link_down`
  (line 689) and `mtk_mac_link_up` (line 769) unconditionally write to
  `MTK_MAC_MCR`
- [Phase 6] `MTK_MAC_MCR(x)` = 0x10100 + x*0x100, confirmed in header
  file (line 453)
- [Phase 8] VLAN corruption confirmed by commit message: "Writes to
  these registers only corrupt the ESW VLAN configuration"
- UNVERIFIED: Could not access full lore.kernel.org discussion due to
  Anubis protection; relied on tags in the commit message

**YES**

 drivers/net/ethernet/mediatek/mtk_eth_soc.c | 34 ++++++++++++++++++---
 1 file changed, 30 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
index ddc321a02fdae..bb8ced22ca3be 100644
--- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c
+++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
@@ -562,9 +562,7 @@ static void mtk_mac_config(struct phylink_config *config, unsigned int mode,
 	int val, ge_mode, err = 0;
 	u32 i;
 
-	/* MT76x8 has no hardware settings between for the MAC */
-	if (!MTK_HAS_CAPS(eth->soc->caps, MTK_SOC_MT7628) &&
-	    mac->interface != state->interface) {
+	if (mac->interface != state->interface) {
 		/* Setup soc pin functions */
 		switch (state->interface) {
 		case PHY_INTERFACE_MODE_TRGMII:
@@ -956,6 +954,30 @@ static const struct phylink_mac_ops mtk_phylink_ops = {
 	.mac_enable_tx_lpi = mtk_mac_enable_tx_lpi,
 };
 
+static void rt5350_mac_config(struct phylink_config *config, unsigned int mode,
+				const struct phylink_link_state *state)
+{
+}
+
+static void rt5350_mac_link_down(struct phylink_config *config, unsigned int mode,
+				phy_interface_t interface)
+{
+}
+
+static void rt5350_mac_link_up(struct phylink_config *config,
+			    struct phy_device *phy,
+			    unsigned int mode, phy_interface_t interface,
+			    int speed, int duplex, bool tx_pause, bool rx_pause)
+{
+}
+
+/* MT76x8 (rt5350-eth) does not expose any MAC control registers */
+static const struct phylink_mac_ops rt5350_phylink_ops = {
+	.mac_config = rt5350_mac_config,
+	.mac_link_down = rt5350_mac_link_down,
+	.mac_link_up = rt5350_mac_link_up,
+};
+
 static void mtk_mdio_config(struct mtk_eth *eth)
 {
 	u32 val;
@@ -4780,6 +4802,7 @@ static const struct net_device_ops mtk_netdev_ops = {
 
 static int mtk_add_mac(struct mtk_eth *eth, struct device_node *np)
 {
+	const struct phylink_mac_ops *mac_ops = &mtk_phylink_ops;
 	const __be32 *_id = of_get_property(np, "reg", NULL);
 	phy_interface_t phy_mode;
 	struct phylink *phylink;
@@ -4914,9 +4937,12 @@ static int mtk_add_mac(struct mtk_eth *eth, struct device_node *np)
 			  mac->phylink_config.supported_interfaces);
 	}
 
+	if (MTK_HAS_CAPS(eth->soc->caps, MTK_SOC_MT7628))
+		mac_ops = &rt5350_phylink_ops;
+
 	phylink = phylink_create(&mac->phylink_config,
 				 of_fwnode_handle(mac->of_node),
-				 phy_mode, &mtk_phylink_ops);
+				 phy_mode, mac_ops);
 	if (IS_ERR(phylink)) {
 		err = PTR_ERR(phylink);
 		goto free_netdev;
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH AUTOSEL 7.0-6.6] wifi: mt76: mt7996: reset device after MCU message timeout
       [not found] <20260420132314.1023554-1-sashal@kernel.org>
                   ` (12 preceding siblings ...)
  2026-04-20 13:20 ` [PATCH AUTOSEL 7.0-6.12] net: ethernet: mtk_eth_soc: avoid writing to ESW registers on MT7628 Sasha Levin
@ 2026-04-20 13:20 ` Sasha Levin
  2026-04-20 13:21 ` [PATCH AUTOSEL 7.0-5.10] drm/mediatek: mtk_dsi: enable hs clock during pre-enable Sasha Levin
                   ` (2 subsequent siblings)
  16 siblings, 0 replies; 17+ messages in thread
From: Sasha Levin @ 2026-04-20 13:20 UTC (permalink / raw)
  To: patches, stable
  Cc: Chad Monroe, Felix Fietkau, Sasha Levin, lorenzo, ryder.lee,
	matthias.bgg, angelogioacchino.delregno, linux-wireless,
	linux-kernel, linux-arm-kernel, linux-mediatek

From: Chad Monroe <chad@monroe.io>

[ Upstream commit d2b860454ea2df8f336e9b859da7ffb27f43444d ]

Trigger a full reset after MCU message timeout.

Signed-off-by: Chad Monroe <chad@monroe.io>
Link: https://patch.msgid.link/6e05ed063f3763ad3457633c56b60a728a49a6f0.1765203753.git.chad@monroe.io
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

LLM Generated explanations, may be completely bogus:

Now I have all the information needed for the complete analysis. Let me
compile the findings.

---

## PHASE 1: COMMIT MESSAGE FORENSICS

**Step 1.1: Subject Line**
- Subsystem: `wifi: mt76: mt7996`
- Action verb: "reset" (implies recovery from a failure state)
- Summary: Trigger a device reset when MCU message timeouts occur on
  mt7996
- Record: [wifi/mt76/mt7996] [reset] [Trigger full device recovery after
  MCU message timeout]

**Step 1.2: Tags**
- `Signed-off-by: Chad Monroe <chad@monroe.io>` - patch author
- `Link: https://patch.msgid.link/...` - reference to lore
- `Signed-off-by: Felix Fietkau <nbd@nbd.name>` - maintainer/committer
  of mt76
- No Fixes: tag (expected for manual review candidates)
- No Cc: stable (expected)
- No Reported-by tag
- Record: Author is Chad Monroe; applied by Felix Fietkau (the mt76
  subsystem maintainer)

**Step 1.3: Commit Body**
- Body is very brief: "Trigger a full reset after MCU message timeout."
- No stack traces or reproduction steps given
- The mt7915 equivalent (commit 10f73bb3938f7c5) provides more context:
  "MCU hangs do not trigger watchdog interrupts, so they can only be
  detected through MCU message timeouts. Ensure that the hardware gets
  restarted when that happens in order to prevent a permanent stuck
  state."
- Record: Bug = MCU hang leaves device permanently stuck. Symptom = WiFi
  device becomes non-functional, requires reboot. Root cause = MCU hang
  without watchdog interrupt, only detectable via message timeout, no
  recovery triggered.

**Step 1.4: Hidden Bug Fix Detection**
- "reset device after MCU message timeout" - this is clearly a fix for a
  missing recovery path. Without it, a firmware hang results in a
  permanent stuck state.
- Record: This IS a bug fix. The device becomes permanently stuck
  without it.

---

## PHASE 2: DIFF ANALYSIS

**Step 2.1: Inventory**
- `mcu.c`: +9 lines (in `mt7996_mcu_parse_response`)
- `mac.c`: +5 lines (in `mt7996_reset`)
- Total: ~14 lines added, 0 removed
- Scope: Single-driver, surgical fix in two closely-related functions
- Record: 2 files, +14 lines, functions: mt7996_mcu_parse_response,
  mt7996_reset

**Step 2.2: Code Flow Changes**

Hunk 1 (mcu.c): In `mt7996_mcu_parse_response()`, when `skb == NULL`
(MCU timeout):
- **Before**: Log error, return -ETIMEDOUT. No recovery action.
- **After**: Log error, atomically set `MT76_MCU_RESET` bit (via
  `test_and_set_bit` to prevent duplicates), set `recovery.restart =
  true`, wake up MCU wait queue, queue `reset_work`, wake up
  `reset_wait`, then return -ETIMEDOUT.

Hunk 2 (mac.c): In `mt7996_reset()`, before the existing `queue_work`:
- **Before**: Always queue reset_work and wake reset_wait
  unconditionally.
- **After**: If `MT_MCU_CMD_STOP_DMA` is set, additionally set
  `MT76_MCU_RESET` bit and wake up MCU wait queue, aborting pending MCU
  operations before reset.

**Step 2.3: Bug Mechanism**
- Category: Missing error recovery / permanent hardware stuck state
- The MCU can hang in a way that doesn't trigger a hardware watchdog
  interrupt. The only indication is MCU message timeouts. Without this
  patch, timeouts just return an error code but never trigger device
  recovery. The device becomes permanently non-functional.
- Record: Missing recovery mechanism. MCU hang → timeout → error return
  → no recovery → permanent stuck state.

**Step 2.4: Fix Quality**
- Obviously correct: Mirrors the exact same pattern used in mt7915
  (commit 10f73bb3938f7c5) and mt7915's STOP_DMA handling (commit
  b13cd593ef2402).
- Minimal/surgical: Only adds recovery trigger code at the exact points
  needed.
- `test_and_set_bit` prevents duplicate resets.
- Regression risk: Very low. The reset_work handler already handles
  `recovery.restart = true` properly. The STOP_DMA path already exists
  for other triggers.
- Record: High quality fix, obviously correct, mirrors established
  patterns.

---

## PHASE 3: GIT HISTORY INVESTIGATION

**Step 3.1: Blame**
- `mt7996_mcu_parse_response()`: Unchanged since original driver
  addition by Shayne Chen (commit 98686cd21624c7, November 2022, v6.2).
- `mt7996_reset()`: Added by Bo Jiao (commit 27015b6fbcca83, April 2023,
  v6.4) as "enable full system reset support".
- The buggy code (missing recovery trigger) has been present since the
  driver was first created.
- Record: Bug present since v6.2 (mcu.c) and v6.4 (mac.c had
  mt7996_reset without STOP_DMA handling).

**Step 3.2: No Fixes: Tag**
- N/A - no Fixes: tag present (expected).

**Step 3.3: File History**
- The mt7996 reset infrastructure was significantly improved in v6.18
  (ace5d3b6b49e8 "improve hardware restart reliability"). However, the
  basic recovery mechanism has been in place since v6.4.
- The commit `beb01caa570c52` in v6.18 decreased MCU timeouts to allow
  faster recovery - this patch's logic works with either timeout value.
- Record: This commit is standalone; no prerequisites needed beyond the
  v6.4 reset infrastructure.

**Step 3.4: Author**
- Chad Monroe is a contributor to mt76 (5 commits found in the driver).
- Felix Fietkau (nbd@nbd.name) is THE mt76 subsystem maintainer - he
  applied the patch.
- Felix also authored the identical fix for mt7915 (10f73bb3938f7c5).
- Record: Applied by subsystem maintainer. Author is a regular
  contributor.

**Step 3.5: Dependencies**
- All structures/flags used already exist: `MT76_MCU_RESET`,
  `recovery.restart`, `mcu.wait`, `reset_work`, `reset_wait`,
  `MT_MCU_CMD_STOP_DMA`.
- No new functions or data structures introduced.
- Record: Fully self-contained, no dependencies on other uncommitted
  patches.

---

## PHASE 4: MAILING LIST AND EXTERNAL RESEARCH

**Step 4.1: Original Discussion**
- lore.kernel.org returned anti-bot protection; direct web access was
  blocked.
- b4 dig of the mt7915 equivalent found it was part of a 24-patch series
  (v2) by Felix Fietkau from August 2024.
- The mt7996 version is by Chad Monroe and was ported from the mt7915
  fix.
- Record: Could not access lore directly due to anti-bot protection. b4
  confirmed the mt7915 version was part of Felix Fietkau's cleanup
  series.

**Step 4.2: Reviewer**
- Applied by Felix Fietkau, the mt76 subsystem maintainer.
- Record: Subsystem maintainer applied the patch directly.

**Step 4.3-4.5**: Blocked by lore anti-bot protection. No additional
information could be gathered.

---

## PHASE 5: CODE SEMANTIC ANALYSIS

**Step 5.1: Functions Modified**
- `mt7996_mcu_parse_response()` - MCU response parser (callback)
- `mt7996_reset()` - device reset entry point

**Step 5.2: Callers of `mt7996_mcu_parse_response`**
- Registered as `.mcu_parse_response` in `mt7996_mcu_ops` (mcu.c line
  3363).
- Called from `mt76_mcu_skb_send_and_get_msg()` in `mcu.c` (core mt76
  code, line 122).
- This is the universal MCU message response handler - called for EVERY
  MCU command the driver issues.
- Record: Called for every MCU message. Critical, high-frequency path.

**Step 5.3: Callers of `mt7996_reset`**
- Called from interrupt context and error recovery paths.
- Used by `mt7996_irq_tasklet()` when MCU command interrupts occur.
- Record: Called from interrupt handler / tasklet context.

**Step 5.4: Call Chain**
- Any WiFi operation → MCU command → `mt76_mcu_skb_send_and_get_msg()` →
  wait for response → `mt7996_mcu_parse_response()` → if timeout →
  trigger reset
- This path is reachable from normal WiFi operations (scan, associate,
  channel switch, etc.)
- Record: Fully reachable from normal user operations.

**Step 5.5: Similar Patterns**
- mt7915 has identical recovery logic (10f73bb3938f7c5 +
  b13cd593ef2402).
- mt7921/mt7925 have similar reset mechanisms.
- Record: Well-established pattern across the mt76 driver family.

---

## PHASE 6: STABLE TREE ANALYSIS

**Step 6.1: Buggy Code in Stable Trees**
- mt7996 driver added in v6.2.
- `mt7996_reset()` added in v6.4.
- `mt7996_mcu_parse_response()` unchanged since v6.2.
- The mcu.c part of the fix applies to 6.2+. The mac.c part applies to
  6.4+.
- Affected stable trees: 6.6.y, 6.12.y, and any other active LTS that
  includes mt7996.
- Record: Bug exists in 6.6.y and all later stable trees.

**Step 6.2: Backport Complications**
- The code being modified is unchanged since original introduction.
- Should apply cleanly to 6.6.y.
- Record: Expected clean apply.

**Step 6.3: No Related Fixes Already in Stable**
- No similar fix found in stable trees.
- Record: No existing fix for this issue in stable.

---

## PHASE 7: SUBSYSTEM CONTEXT

**Step 7.1: Subsystem**
- WiFi driver (drivers/net/wireless/mediatek/mt76/mt7996)
- MT7996 is MediaTek's Wi-Fi 7 (802.11be) chipset - used in routers and
  access points.
- Criticality: IMPORTANT - WiFi is critical infrastructure for many
  users.
- Record: [WiFi driver] [IMPORTANT - affects mt7996 hardware users]

**Step 7.2: Activity**
- Very actively developed - dozens of commits in recent releases.
- Active MLO/Wi-Fi 7 development ongoing.
- Record: Highly active subsystem.

---

## PHASE 8: IMPACT AND RISK ASSESSMENT

**Step 8.1: Affected Users**
- All users of mt7996/mt7992 WiFi hardware (routers, access points, PCIe
  WiFi cards).
- Record: Hardware-specific, but MT7996 is a current-generation popular
  WiFi chipset.

**Step 8.2: Trigger Conditions**
- Triggers when MCU firmware hangs without issuing a watchdog interrupt.
- Can happen during normal operation (firmware bugs, hardware glitches).
- Not user-triggered in the security sense, but can happen during
  routine WiFi operation.
- Record: Firmware hang during normal operation. Not predictable but
  happens in practice (same fix was needed for mt7915).

**Step 8.3: Failure Mode**
- Without the fix: WiFi device becomes permanently non-functional until
  reboot.
- This is a system hang from the WiFi perspective.
- Severity: HIGH (permanent loss of WiFi connectivity, requires reboot)
- Record: Permanent device stuck state. Severity: HIGH.

**Step 8.4: Risk-Benefit Ratio**
- Benefit: HIGH - prevents permanent WiFi device failure, enables
  automatic recovery.
- Risk: VERY LOW - 14 lines, uses `test_and_set_bit` for safety, mirrors
  proven mt7915 pattern, no changes to public APIs or data structures.
- Record: Benefit HIGH, Risk VERY LOW. Excellent ratio.

---

## PHASE 9: FINAL SYNTHESIS

**Step 9.1: Evidence Summary**

FOR backporting:
- Fixes permanent device stuck state (WiFi becomes completely non-
  functional)
- Small and surgical fix (14 lines, 2 files in same driver)
- Mirrors identical fix already applied to sister chip mt7915 (proven
  pattern)
- Applied by mt76 subsystem maintainer (Felix Fietkau)
- No dependencies on other patches
- Buggy code present since driver introduction (v6.2/v6.4)
- `test_and_set_bit` prevents duplicate resets (safe concurrent
  behavior)
- Should apply cleanly to stable trees

AGAINST backporting:
- Brief commit message (but supplemented by identical mt7915 fix
  explanation)
- No explicit stable nomination or Reported-by (but this is expected for
  manual review candidates)

**Step 9.2: Stable Rules Checklist**
1. Obviously correct and tested? **YES** - mirrors proven mt7915
   pattern, applied by maintainer
2. Fixes a real bug? **YES** - permanent device stuck state
3. Important issue? **YES** - device becomes permanently non-functional
4. Small and contained? **YES** - 14 lines in single driver
5. No new features or APIs? **YES** - only adds error recovery
6. Can apply to stable? **YES** - code unchanged since introduction

**Step 9.3: Exception Categories**
- Not an exception category - this is a standard bug fix.

**Step 9.4: Decision**
This is a clear bug fix that prevents a permanent device stuck state.
It's small, obviously correct, follows established patterns, and was
applied by the subsystem maintainer. The benefit-to-risk ratio is
excellent.

---

## Verification

- [Phase 1] Parsed tags: Link to patch.msgid, signed by author (Chad
  Monroe) and maintainer (Felix Fietkau)
- [Phase 2] Diff analysis: +9 lines in mcu.c (reset trigger on timeout),
  +5 lines in mac.c (STOP_DMA handling)
- [Phase 3] git blame: mt7996_mcu_parse_response unchanged since
  98686cd21624c7 (v6.2); mt7996_reset from 27015b6fbcca83 (v6.4)
- [Phase 3] git show 10f73bb3938f7c5: confirmed identical mt7915 fix
  exists with clear explanation of the permanent stuck state bug
- [Phase 3] git show b13cd593ef2402: confirmed identical mt7915 STOP_DMA
  handling pattern
- [Phase 3] git log --author: Chad Monroe has 5 commits in mt76, is a
  regular contributor
- [Phase 4] b4 dig: found mt7915 version in Felix Fietkau's series at
  lore
- [Phase 4] lore direct access blocked by anti-bot protection
- [Phase 5] Grep MT76_MCU_RESET: confirmed flag is used across 15+ files
  in mt76 for reset coordination
- [Phase 5] Grep mt7996_mcu_parse_response: confirmed it's registered as
  mcu_ops callback (line 3363)
- [Phase 5] Grep reset_work: confirmed reset_work/reset_wait
  infrastructure exists since v6.4
- [Phase 6] git tag --contains: mt7996_reset present since v6.4; driver
  since v6.2
- [Phase 6] Current 7.0 code confirmed: mt7996_reset lacks STOP_DMA
  handling, parse_response lacks reset trigger
- [Phase 8] Failure mode: permanent WiFi device stuck state requiring
  reboot - severity HIGH
- UNVERIFIED: Exact lore discussion content could not be accessed due to
  anti-bot protection

**YES**

 drivers/net/wireless/mediatek/mt76/mt7996/mac.c | 5 +++++
 drivers/net/wireless/mediatek/mt76/mt7996/mcu.c | 9 +++++++++
 2 files changed, 14 insertions(+)

diff --git a/drivers/net/wireless/mediatek/mt76/mt7996/mac.c b/drivers/net/wireless/mediatek/mt76/mt7996/mac.c
index d4f3ee943b472..b7aa51481ce82 100644
--- a/drivers/net/wireless/mediatek/mt76/mt7996/mac.c
+++ b/drivers/net/wireless/mediatek/mt76/mt7996/mac.c
@@ -2737,6 +2737,11 @@ void mt7996_reset(struct mt7996_dev *dev)
 		return;
 	}
 
+	if (READ_ONCE(dev->recovery.state) & MT_MCU_CMD_STOP_DMA) {
+		set_bit(MT76_MCU_RESET, &dev->mphy.state);
+		wake_up(&dev->mt76.mcu.wait);
+	}
+
 	queue_work(dev->mt76.wq, &dev->reset_work);
 	wake_up(&dev->reset_wait);
 }
diff --git a/drivers/net/wireless/mediatek/mt76/mt7996/mcu.c b/drivers/net/wireless/mediatek/mt76/mt7996/mcu.c
index c0c042de477b8..54776f0703876 100644
--- a/drivers/net/wireless/mediatek/mt76/mt7996/mcu.c
+++ b/drivers/net/wireless/mediatek/mt76/mt7996/mcu.c
@@ -209,6 +209,7 @@ static int
 mt7996_mcu_parse_response(struct mt76_dev *mdev, int cmd,
 			  struct sk_buff *skb, int seq)
 {
+	struct mt7996_dev *dev = container_of(mdev, struct mt7996_dev, mt76);
 	struct mt7996_mcu_rxd *rxd;
 	struct mt7996_mcu_uni_event *event;
 	int mcu_cmd = FIELD_GET(__MCU_CMD_FIELD_ID, cmd);
@@ -217,6 +218,14 @@ mt7996_mcu_parse_response(struct mt76_dev *mdev, int cmd,
 	if (!skb) {
 		dev_err(mdev->dev, "Message %08x (seq %d) timeout\n",
 			cmd, seq);
+
+		if (!test_and_set_bit(MT76_MCU_RESET, &dev->mphy.state)) {
+			dev->recovery.restart = true;
+			wake_up(&dev->mt76.mcu.wait);
+			queue_work(dev->mt76.wq, &dev->reset_work);
+			wake_up(&dev->reset_wait);
+		}
+
 		return -ETIMEDOUT;
 	}
 
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH AUTOSEL 7.0-5.10] drm/mediatek: mtk_dsi: enable hs clock during pre-enable
       [not found] <20260420132314.1023554-1-sashal@kernel.org>
                   ` (13 preceding siblings ...)
  2026-04-20 13:20 ` [PATCH AUTOSEL 7.0-6.6] wifi: mt76: mt7996: reset device after MCU message timeout Sasha Levin
@ 2026-04-20 13:21 ` Sasha Levin
  2026-04-20 13:21 ` [PATCH AUTOSEL 7.0-6.18] wifi: mt76: mt7996: fix frequency separation for station STR mode Sasha Levin
  2026-04-20 13:22 ` [PATCH AUTOSEL 6.18] net: airoha: Fix memory leak in airoha_qdma_rx_process() Sasha Levin
  16 siblings, 0 replies; 17+ messages in thread
From: Sasha Levin @ 2026-04-20 13:21 UTC (permalink / raw)
  To: patches, stable
  Cc: Gary Bisson, CK Hu, Chen-Yu Tsai, AngeloGioacchino Del Regno,
	Chun-Kuang Hu, Sasha Levin, p.zabel, airlied, simona,
	matthias.bgg, dri-devel, linux-mediatek, linux-kernel,
	linux-arm-kernel

From: Gary Bisson <bisson.gary@gmail.com>

[ Upstream commit 76255024cadbe8c40462953f8193d2d78cd3b0ac ]

Some bridges, such as the TI SN65DSI83, require the HS clock to be
running in order to lock its PLL during its own pre-enable function.

Without this change, the bridge gives the following error:
sn65dsi83 14-002c: failed to lock PLL, ret=-110
sn65dsi83 14-002c: Unexpected link status 0x01
sn65dsi83 14-002c: reset the pipe

Move the necessary functions from enable to pre-enable.

Signed-off-by: Gary Bisson <bisson.gary@gmail.com>
Reviewed-by: CK Hu <ck.hu@mediatek.com>
Tested-by: Chen-Yu Tsai <wenst@chromium.org> # Chromebooks
Tested-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Link: https://patchwork.kernel.org/project/dri-devel/patch/20260120-mtkdsi-v1-1-b0f4094f3ac3@gmail.com/
Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

LLM Generated explanations, may be completely bogus:

Good. Now I have all the information needed for a comprehensive
analysis.

---

## Complete Analysis

### PHASE 1: COMMIT MESSAGE FORENSICS

**Step 1.1: Subject Line**
- Subsystem: `drm/mediatek: mtk_dsi`
- Action: "enable" (HS clock during pre-enable)
- Summary: Move HS clock activation from the bridge enable phase to pre-
  enable so downstream bridges can use it.
Record: [drm/mediatek] [enable] [Move HS clock setup earlier to fix
bridge PLL locking]

**Step 1.2: Tags**
- Signed-off-by: Gary Bisson (author)
- Reviewed-by: CK Hu (MediaTek DSI maintainer)
- Tested-by: Chen-Yu Tsai (Chromium developer) on 5 Chromebook variants
- Tested-by: AngeloGioacchino Del Regno (Collabora, regular MTK
  contributor)
- Reviewed-by: AngeloGioacchino Del Regno
- Link: to patchwork entry
- Signed-off-by: Chun-Kuang Hu (subsystem maintainer, merged it)

Record: Strong review and testing from 3 independent testers across 7+
hardware platforms. Maintainer reviewed and approved.

**Step 1.3: Commit Body**
- Bug: SN65DSI83 bridge chip fails to lock its PLL during pre-enable,
  resulting in errors: "failed to lock PLL, ret=-110" / "Unexpected link
  status 0x01" / "reset the pipe"
- Root cause: HS clock not running during the pre-enable phase
- Fix: Move `mtk_dsi_lane_ready()` and `mtk_dsi_clk_hs_mode(dsi, 1)`
  from enable to pre-enable (poweron)
Record: Clear bug description with error messages. Display completely
fails without fix.

**Step 1.4: Hidden Bug Fix?**
This is NOT hidden - it's explicitly a fix for display not working with
certain DSI bridges.

### PHASE 2: DIFF ANALYSIS

**Step 2.1: Inventory**
- 1 file: `drivers/gpu/drm/mediatek/mtk_dsi.c`
- +17 / -18 lines (net -1 line)
- Functions modified: `mtk_dsi_lane_ready()` (moved earlier),
  `mtk_dsi_poweron()` (added 2 calls), `mtk_output_dsi_enable()`
  (removed 3 lines)
Record: Single-file surgical fix, minimal scope.

**Step 2.2: Code Flow Change**
- `mtk_dsi_lane_ready()` function definition moved earlier (before
  `mtk_dsi_poweron`) - this is purely for forward declaration ordering
- In `mtk_dsi_poweron()` (called during bridge pre_enable): added
  `mtk_dsi_lane_ready(dsi)` and `mtk_dsi_clk_hs_mode(dsi, 1)` at end
- In `mtk_output_dsi_enable()` (called during bridge enable): removed
  `mtk_dsi_lane_ready(dsi)` and `mtk_dsi_clk_hs_mode(dsi, 1)`, kept
  `mtk_dsi_set_mode(dsi)` and `mtk_dsi_start(dsi)`

Before: Lane ready + HS clock in enable phase
After: Lane ready + HS clock in pre-enable phase

**Step 2.3: Bug Mechanism**
Category: Hardware interoperability / timing issue. The SN65DSI83 bridge
requires HS clock from the DSI host during its pre_enable to lock its
PLL. Without HS clock, the bridge fails completely.

**Step 2.4: Fix Quality**
- Obviously correct: just moves existing function calls earlier in the
  init sequence
- Minimal: no new logic, no new code paths
- Regression risk is LOW: extensively tested on 7+ platforms with
  different bridges/panels, all confirmed no regressions

### PHASE 3: GIT HISTORY

**Step 3.1: Blame**
- `mtk_dsi_lane_ready()` introduced by commit `39e8d062b03c3d` (Jitao
  Shi, 2022-05-20) - present since ~v5.19
- `mtk_dsi_clk_hs_mode(dsi, 1)` in enable path introduced by
  `80a5cfd60d2a94` (yt.shen@mediatek.com, 2017-03-31) - present since
  v4.x
- The buggy ordering has existed since 2022 when lane_ready was moved to
  enable
Record: Bug present in all active stable trees (v5.19+)

**Step 3.2: No Fixes: tag** (expected for autosel candidate)

**Step 3.3: File History**
- Recent changes to mtk_dsi.c include bridge API updates
  (devm_drm_bridge_alloc, encoder parameter), HS mode support, pre-
  enable order fix/revert
- The pre-enable order fix/revert (f5b1819193667 / 33e8150bd32d7) is
  related but independent - it was about `pre_enable_prev_first` flag
  management

**Step 3.4: Author**
- Gary Bisson is a regular contributor to MediaTek platforms (Tungsten
  boards), actively maintains DT and driver support

**Step 3.5: Dependencies**
- No dependencies. The commit 8b00951402f74 (HS mode in cmdq) is
  completely independent
- The SN65DSI83 driver already sets `pre_enable_prev_first = true`,
  ensuring correct bridge ordering

### PHASE 4: MAILING LIST DISCUSSION

**Step 4.1: Original Discussion**
- b4 mbox retrieved 5 messages in the thread
- CK Hu (MediaTek DSI maintainer) noted "this changes the flow for all
  SoC and panel, so I would wait for more SoC and more panel test" -
  then gave Reviewed-by after testing completed
- AngeloGioacchino Del Regno tested on MT6795 + MT8395, gave both
  Tested-by and Reviewed-by
- Chen-Yu Tsai tested on 5 Chromebook models (MT8173, MT8183x2,
  MT8186x2) - "No regressions observed"
- Chun-Kuang Hu applied it with message "Applied to mediatek-drm-next"

**Step 4.2: Reviewers**
All appropriate MediaTek subsystem maintainers were CC'd and reviewed.
CK Hu explicitly asked for extensive testing, which was provided.

### PHASE 5: CODE SEMANTIC ANALYSIS

**Step 5.1: Functions Modified**
- `mtk_dsi_poweron()`: called from `mtk_dsi_bridge_atomic_pre_enable()`
  and `mtk_dsi_ddp_start()`
- `mtk_output_dsi_enable()`: called from
  `mtk_dsi_bridge_atomic_enable()`
- `mtk_dsi_lane_ready()`: also called from `mtk_dsi_host_transfer()`
  (for DSI command transfers)

**Step 5.2: Impact on mtk_dsi_host_transfer**
After the patch, `mtk_dsi_lane_ready()` call in
`mtk_dsi_host_transfer()` becomes a no-op during normal operation (lanes
already ready from poweron). This is safe because DSI must be powered on
before any host transfers.

**Step 5.3: Bridge ordering confirmed**
The SN65DSI83 bridge driver sets `ctx->bridge.pre_enable_prev_first =
true` (line 1041 of `ti-sn65dsi83.c`), which causes
`drm_atomic_bridge_chain_pre_enable()` to call the MTK DSI pre_enable
BEFORE the SN65DSI83's pre_enable. This confirms the fix works
correctly.

### PHASE 6: STABLE TREE ANALYSIS

**Step 6.1: Code exists in stable trees**
The buggy code (`mtk_dsi_lane_ready()` being called from enable instead
of pre_enable) has been present since v5.19, so it affects all active
stable trees from 6.1 onwards.

**Step 6.2: Backport difficulty**
The patch should apply cleanly - the context in `mtk_dsi_poweron()` and
`mtk_output_dsi_enable()` is unchanged in the 7.0 tree.

### PHASE 7: SUBSYSTEM CONTEXT

**Step 7.1: Subsystem**
- drm/mediatek (DRM display driver) - IMPORTANT category
- MediaTek SoCs are used extensively in Chromebooks, Android devices,
  and embedded systems

### PHASE 8: IMPACT AND RISK

**Step 8.1: Affected users**
Users of MediaTek SoCs with DSI bridges that need HS clock during
initialization (specifically SN65DSI83, potentially others).

**Step 8.2: Trigger conditions**
Every display initialization when using SN65DSI83 with MediaTek DSI. The
display simply doesn't work.

**Step 8.3: Failure mode**
Without fix: Display completely fails to initialize (PLL lock fails,
bridge errors, no display output). Severity: HIGH - affects usability.

**Step 8.4: Risk-Benefit**
- Benefit: HIGH - enables display to work for specific bridge
  combinations, no regressions observed on 7+ platforms
- Risk: LOW - moves existing function calls earlier, extensively tested,
  well-reviewed, single file

### PHASE 9: SYNTHESIS

**Evidence FOR backporting:**
1. Fixes a real hardware bug: display doesn't work with SN65DSI83 bridge
2. Small, surgical change (17 insertions, 18 deletions, net -1 line,
   single file)
3. Extensive testing: 7+ platforms (Chromebooks, MediaTek boards) - zero
   regressions
4. Strong review: MediaTek DSI maintainer, Collabora engineer, Chromium
   developer
5. Code is well-understood: simply moves existing calls from enable to
   pre-enable
6. Affects popular hardware (Chromebooks use MediaTek SoCs extensively)

**Evidence AGAINST backporting:**
1. CK Hu noted it "changes the flow for all SoC and panel" - but this
   concern was addressed by extensive testing
2. No explicit Cc: stable tag - but this is expected for autosel
   candidates

**Stable rules checklist:**
1. Obviously correct and tested? YES - extensively tested on 7+
   platforms
2. Fixes a real bug? YES - display completely fails with SN65DSI83
   bridge
3. Important issue? YES - hardware doesn't work (no display output)
4. Small and contained? YES - single file, 35 lines total changed
5. No new features? YES - just moves existing code to different phase
6. Can apply to stable? YES - context appears unchanged

---

### Verification

- [Phase 1] Parsed tags: Reviewed-by (CK Hu, AngeloGioacchino), Tested-
  by (Chen-Yu Tsai on 5 Chromebooks, AngeloGioacchino on 2 platforms)
- [Phase 2] Diff analysis: mtk_dsi_lane_ready + clk_hs_mode moved from
  enable to poweron (pre_enable), 17 add / 18 del
- [Phase 3] git blame: lane_ready introduced in commit 39e8d062b03c3d
  (2022, v5.19), clk_hs_mode in enable since 80a5cfd60d2a94 (2017)
- [Phase 3] git show 33e8150bd32d7: revert of pre_enable_prev_first in
  MediaTek DSI is independent
- [Phase 3] git show 8b00951402f74: HS mode support in cmdq is
  independent, no dependency
- [Phase 4] b4 mbox retrieved thread: CK Hu reviewed, requested more
  testing, got it; patch applied to mediatek-drm-next
- [Phase 4] Chen-Yu Tsai tested on MT8173, MT8183, MT8186 Chromebooks -
  no regressions
- [Phase 4] AngeloGioacchino tested on MT6795, MT8395 - no regressions
- [Phase 5] Confirmed SN65DSI83 sets pre_enable_prev_first=true (line
  1041), so MTK DSI pre_enable runs before SN65DSI83 pre_enable
- [Phase 5] mtk_dsi_host_transfer's call to lane_ready() becomes no-op
  (safe)
- [Phase 6] Buggy code exists in all stable trees since v5.19
- [Phase 8] Failure mode: display doesn't work at all (PLL lock fails),
  severity HIGH

The fix is small, surgical, well-tested across multiple platforms, and
fixes a real hardware bug where display completely fails with SN65DSI83
bridges on MediaTek DSI.

**YES**

 drivers/gpu/drm/mediatek/mtk_dsi.c | 35 +++++++++++++++---------------
 1 file changed, 17 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/mediatek/mtk_dsi.c b/drivers/gpu/drm/mediatek/mtk_dsi.c
index aaf6c9ebd319f..5aa71fcdcfab7 100644
--- a/drivers/gpu/drm/mediatek/mtk_dsi.c
+++ b/drivers/gpu/drm/mediatek/mtk_dsi.c
@@ -673,6 +673,21 @@ static s32 mtk_dsi_switch_to_cmd_mode(struct mtk_dsi *dsi, u8 irq_flag, u32 t)
 	}
 }
 
+static void mtk_dsi_lane_ready(struct mtk_dsi *dsi)
+{
+	if (!dsi->lanes_ready) {
+		dsi->lanes_ready = true;
+		mtk_dsi_rxtx_control(dsi);
+		usleep_range(30, 100);
+		mtk_dsi_reset_dphy(dsi);
+		mtk_dsi_clk_ulp_mode_leave(dsi);
+		mtk_dsi_lane0_ulp_mode_leave(dsi);
+		mtk_dsi_clk_hs_mode(dsi, 0);
+		usleep_range(1000, 3000);
+		/* The reaction time after pulling up the mipi signal for dsi_rx */
+	}
+}
+
 static int mtk_dsi_poweron(struct mtk_dsi *dsi)
 {
 	struct device *dev = dsi->host.dev;
@@ -725,6 +740,8 @@ static int mtk_dsi_poweron(struct mtk_dsi *dsi)
 	mtk_dsi_set_vm_cmd(dsi);
 	mtk_dsi_config_vdo_timing(dsi);
 	mtk_dsi_set_interrupt_enable(dsi);
+	mtk_dsi_lane_ready(dsi);
+	mtk_dsi_clk_hs_mode(dsi, 1);
 
 	return 0;
 err_disable_engine_clk:
@@ -770,30 +787,12 @@ static void mtk_dsi_poweroff(struct mtk_dsi *dsi)
 	dsi->lanes_ready = false;
 }
 
-static void mtk_dsi_lane_ready(struct mtk_dsi *dsi)
-{
-	if (!dsi->lanes_ready) {
-		dsi->lanes_ready = true;
-		mtk_dsi_rxtx_control(dsi);
-		usleep_range(30, 100);
-		mtk_dsi_reset_dphy(dsi);
-		mtk_dsi_clk_ulp_mode_leave(dsi);
-		mtk_dsi_lane0_ulp_mode_leave(dsi);
-		mtk_dsi_clk_hs_mode(dsi, 0);
-		usleep_range(1000, 3000);
-		/* The reaction time after pulling up the mipi signal for dsi_rx */
-	}
-}
-
 static void mtk_output_dsi_enable(struct mtk_dsi *dsi)
 {
 	if (dsi->enabled)
 		return;
 
-	mtk_dsi_lane_ready(dsi);
 	mtk_dsi_set_mode(dsi);
-	mtk_dsi_clk_hs_mode(dsi, 1);
-
 	mtk_dsi_start(dsi);
 
 	dsi->enabled = true;
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH AUTOSEL 7.0-6.18] wifi: mt76: mt7996: fix frequency separation for station STR mode
       [not found] <20260420132314.1023554-1-sashal@kernel.org>
                   ` (14 preceding siblings ...)
  2026-04-20 13:21 ` [PATCH AUTOSEL 7.0-5.10] drm/mediatek: mtk_dsi: enable hs clock during pre-enable Sasha Levin
@ 2026-04-20 13:21 ` Sasha Levin
  2026-04-20 13:22 ` [PATCH AUTOSEL 6.18] net: airoha: Fix memory leak in airoha_qdma_rx_process() Sasha Levin
  16 siblings, 0 replies; 17+ messages in thread
From: Sasha Levin @ 2026-04-20 13:21 UTC (permalink / raw)
  To: patches, stable
  Cc: Peter Chiu, Lorenzo Bianconi, Felix Fietkau, Sasha Levin,
	ryder.lee, matthias.bgg, angelogioacchino.delregno,
	linux-wireless, linux-kernel, linux-arm-kernel, linux-mediatek

From: Peter Chiu <chui-hao.chiu@mediatek.com>

[ Upstream commit 59a295335021f6973a34566554b2b9371f1c6f7d ]

Fix frequency separation field for STR in MLD capabilities to get the
correct chip capability.

Signed-off-by: Peter Chiu <chui-hao.chiu@mediatek.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://patch.msgid.link/20260316-mt7996-sta-str-v1-1-666814e6ab2d@kernel.org
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

LLM Generated explanations, may be completely bogus:

Now I have enough information to complete the full analysis.

---

## PHASE 1: COMMIT MESSAGE FORENSICS

### Step 1.1: Subject Line
- **Subsystem**: `wifi: mt76: mt7996:` — MediaTek mt7996 WiFi 7 wireless
  driver
- **Action verb**: "fix" — explicitly a bug fix
- **Summary**: Corrects frequency separation field for station STR mode
  in MLD capabilities

### Step 1.2: Tags
- `Signed-off-by: Peter Chiu <chui-hao.chiu@mediatek.com>` — author from
  MediaTek (the chip vendor)
- `Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>` — co-
  author/maintainer of mt76
- `Link:` to patch.msgid.link — mailing list submission link
- `Signed-off-by: Felix Fietkau <nbd@nbd.name>` — mt76 maintainer
  applied the patch
- No Fixes: tag, no Cc: stable, no Reported-by — expected for candidate
  review

### Step 1.3: Commit Body
The message says: "Fix frequency separation field for STR in MLD
capabilities to get the correct chip capability." The bug is that the
station's MLD capabilities struct was missing the `FREQ_SEP_TYPE_IND`
field, causing the driver to advertise incorrect STR capabilities during
(Re)Association Request frames.

### Step 1.4: Hidden Bug Fix?
This is an explicit fix — the word "fix" is in the subject. The missing
capability field causes incorrect WiFi frame content to be advertised to
the AP during MLD association.

Record: Genuine bug fix — incorrect WiFi capability advertisement.

---

## PHASE 2: DIFF ANALYSIS

### Step 2.1: Inventory
- **1 file** changed: `drivers/net/wireless/mediatek/mt76/mt7996/init.c`
- **1 line added**:
  `FIELD_PREP_CONST(IEEE80211_MLD_CAP_OP_FREQ_SEP_TYPE_IND, 1) |`
- Scope: Single-file, static initializer change

### Step 2.2: Code Flow
- **Before**: Station iftype entry in `iftypes_ext_capa[]` only sets
  `IEEE80211_MLD_CAP_OP_MAX_SIMUL_LINKS`
- **After**: Station entry additionally sets
  `IEEE80211_MLD_CAP_OP_FREQ_SEP_TYPE_IND` to value 1
- The `mld_capa_and_ops` field is consumed by mac80211 in `mlme.c`
  (lines 2069 and 10612-10613) and included directly in MLD capability
  elements of association frames

### Step 2.3: Bug Mechanism
This is a **logic/correctness fix** — a missing capability field in a
static const initializer. `IEEE80211_MLD_CAP_OP_FREQ_SEP_TYPE_IND` (mask
`0x0f80`, bits 7-11) was not set, meaning the station reported frequency
separation type = 0 to the AP, which does not reflect the actual mt7996
hardware STR capability.

### Step 2.4: Fix Quality
- Trivially correct — adding a missing field to a bitfield OR expression
- Cannot introduce a regression — it's a static const initializer
- Minimal and surgical
- No red flags

Record: 1 line added, static initializer, zero regression risk.

---

## PHASE 3: GIT HISTORY INVESTIGATION

### Step 3.1: Blame
The station entry was introduced by commit `f5160304d57c55` ("wifi:
mt76: mt7996: Enable MLO support for client interfaces") by Lorenzo
Bianconi on 2025-09-01. The base `iftypes_ext_capa[]` array was
introduced by `a9eae65d97f3cb` ("Export MLO AP capabilities to
mac80211") on 2025-08-27.

### Step 3.2: Fixes Target
No explicit Fixes: tag, but the implicit target is `f5160304d57c55`
which first introduced the station MLD capabilities. That commit was
first present in **v6.18** (confirmed via `git merge-base --is-
ancestor`). NOT in v6.17 or earlier.

### Step 3.3: File History
The file has 20 recent commits, many related to MLO/MLD enablement. This
fix appears standalone — no series dependencies.

### Step 3.4: Author
- Peter Chiu is from MediaTek (the silicon vendor) — strong authority on
  hardware capabilities
- Lorenzo Bianconi is the mt76 co-maintainer who also signed off
- Felix Fietkau (nbd) is the mt76 maintainer who applied the patch

Record: Author is from the chip vendor. Both mt76 maintainers signed
off. Bug introduced in v6.18.

### Step 3.5: Dependencies
The fix only adds a line to an existing OR expression in a static
initializer. No functional dependencies. The prerequisite code
(`iftypes_ext_capa` with the station entry) exists in v6.18 and v7.0.

---

## PHASE 4: MAILING LIST

### Step 4.1-4.5
Lore/WebFetch was blocked by Anubis protection. `b4 dig` for the
message-id didn't find a match by commit hash (the commit hasn't been
applied to the tree I'm on). The link
`https://patch.msgid.link/20260316-mt7996-sta-
str-v1-1-666814e6ab2d@kernel.org` indicates this was a v1 single-patch
submission from Lorenzo Bianconi.

Record: Could not fetch discussion. Patch submitted as v1, single patch.

---

## PHASE 5: CODE SEMANTIC ANALYSIS

### Step 5.1-5.4
The `iftypes_ext_capa[]` array is assigned to `wiphy->iftype_ext_capab`
at line 497 of `init.c`. This is consumed by mac80211's `mlme.c` via
`cfg80211_get_iftype_ext_capa()` — the `mld_capa_and_ops` field is
directly encoded into (Re)Association Request frames and MLD
reconfiguration frames. This is a hot path for any MLD station
association.

### Step 5.5: Similar Patterns
The AP section of the same array does NOT include `FREQ_SEP_TYPE_IND`
either, but only the station section is fixed here (STR is a station-
side mode). The mt7925 driver also sets `mld_capa_and_ops` but
dynamically.

Record: Capability is directly embedded in WiFi management frames during
association.

---

## PHASE 6: STABLE TREE ANALYSIS

### Step 6.1: Does the buggy code exist in stable trees?
- The buggy commit `f5160304d57c55` is in **v6.18 and v7.0 only**
- NOT in v6.17 or earlier — so this fix is irrelevant for all current
  LTS trees (6.12.y, 6.6.y, 6.1.y, 5.15.y)
- Only relevant for **7.0.y** stable and potentially 6.18.y if that is
  still maintained

### Step 6.2: Backport Complications
- 1 line addition to a static initializer — will apply cleanly to 7.0.y
- No conflicting changes expected

### Step 6.3: No related fixes already in stable

---

## PHASE 7: SUBSYSTEM CONTEXT

### Step 7.1
- **Subsystem**: WiFi drivers / MediaTek mt76 / mt7996
- **Criticality**: PERIPHERAL (specific WiFi hardware) but WiFi is
  important for many users

### Step 7.2
- Very active subsystem — 188 mt7996 commits since v6.12
- mt7996 is a WiFi 7 chip with active MLO development

---

## PHASE 8: IMPACT AND RISK ASSESSMENT

### Step 8.1: Affected Users
- Mt7996 WiFi 7 hardware users attempting MLD (Multi-Link Device)
  operation in station mode

### Step 8.2: Trigger Conditions
- Triggered whenever a mt7996 station performs MLD association (common
  for WiFi 7 users)
- The wrong capability is always advertised

### Step 8.3: Failure Mode
- Incorrect WiFi capability in association frames → potentially
  incorrect STR mode negotiation, possible performance degradation or
  failed MLD operation
- Severity: **MEDIUM** — functional correctness issue, not a crash or
  security vulnerability

### Step 8.4: Risk-Benefit
- **Benefit**: Medium — corrects WiFi MLD capability for mt7996 users
- **Risk**: Very low — 1-line static initializer change, cannot regress
- **Ratio**: Favorable

---

## PHASE 9: FINAL SYNTHESIS

### Step 9.1: Evidence
**FOR backporting:**
- Genuine correctness fix (missing capability field)
- Trivially small (1 line) and obviously correct
- Zero regression risk (static const initializer)
- Written by chip vendor (MediaTek), signed by both mt76 maintainers
- Fixes real functional issue for WiFi 7 MLD users
- Buggy code exists in 7.0

**AGAINST backporting:**
- Not a crash, security, or data corruption fix
- Only affects mt7996 MLD station mode users (relatively narrow)
- Relatively new code (v6.18+)

### Step 9.2: Stable Rules Checklist
1. Obviously correct and tested? **Yes** — trivially correct 1-line
   addition
2. Fixes a real bug? **Yes** — incorrect WiFi capability advertisement
3. Important issue? **Medium** — functional correctness for WiFi MLD
4. Small and contained? **Yes** — 1 line in 1 file
5. No new features? **Correct** — fixes existing code
6. Can apply to stable? **Yes** — clean apply expected

### Step 9.3: Exception Categories
Not an exception category; this is a standard driver bug fix.

### Step 9.4: Decision
The fix is a genuine 1-line correctness fix that corrects incorrect WiFi
MLD capability advertisement for mt7996 hardware. It's written by the
chip vendor, reviewed by both subsystem maintainers, and has zero
regression risk. While it's not a crash or security fix, it corrects
real WiFi functionality for mt7996 users in MLD/STR mode. The extremely
small scope and zero risk make this appropriate for stable.

---

## Verification

- [Phase 1] Parsed tags: Author from MediaTek, co-signed by mt76 co-
  maintainer and maintainer
- [Phase 2] Diff analysis: 1 line added to static const initializer,
  adds missing `IEEE80211_MLD_CAP_OP_FREQ_SEP_TYPE_IND` field
- [Phase 3] git blame: Station MLD caps introduced by `f5160304d57c55`
  (v6.18), confirmed via `git merge-base`
- [Phase 3] git merge-base: Buggy commit NOT in v6.17 or earlier, IS in
  v6.18 and v7.0
- [Phase 5] grep callers: `mld_capa_and_ops` consumed in
  `net/mac80211/mlme.c` lines 2069 and 10612-10613 for association
  frames
- [Phase 5] grep definition: `IEEE80211_MLD_CAP_OP_FREQ_SEP_TYPE_IND` =
  0x0f80 in `include/linux/ieee80211-eht.h`
- [Phase 6] Buggy code only in v6.18+ and v7.0; irrelevant for older LTS
  trees
- [Phase 7] Peter Chiu confirmed as MediaTek contributor; Lorenzo
  Bianconi and Felix Fietkau are mt76 maintainers
- UNVERIFIED: Could not access lore.kernel.org discussion due to Anubis
  protection

**YES**

 drivers/net/wireless/mediatek/mt76/mt7996/init.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/wireless/mediatek/mt76/mt7996/init.c b/drivers/net/wireless/mediatek/mt76/mt7996/init.c
index 00a8286bd1368..d940ca84bd0f1 100644
--- a/drivers/net/wireless/mediatek/mt76/mt7996/init.c
+++ b/drivers/net/wireless/mediatek/mt76/mt7996/init.c
@@ -85,6 +85,7 @@ static const struct wiphy_iftype_ext_capab iftypes_ext_capa[] = {
 		.extended_capabilities_mask = if_types_ext_capa_ap,
 		.extended_capabilities_len = sizeof(if_types_ext_capa_ap),
 		.mld_capa_and_ops =
+			FIELD_PREP_CONST(IEEE80211_MLD_CAP_OP_FREQ_SEP_TYPE_IND, 1) |
 			FIELD_PREP_CONST(IEEE80211_MLD_CAP_OP_MAX_SIMUL_LINKS,
 					 MT7996_MAX_RADIOS - 1),
 	},
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH AUTOSEL 6.18] net: airoha: Fix memory leak in airoha_qdma_rx_process()
       [not found] <20260420132314.1023554-1-sashal@kernel.org>
                   ` (15 preceding siblings ...)
  2026-04-20 13:21 ` [PATCH AUTOSEL 7.0-6.18] wifi: mt76: mt7996: fix frequency separation for station STR mode Sasha Levin
@ 2026-04-20 13:22 ` Sasha Levin
  16 siblings, 0 replies; 17+ messages in thread
From: Sasha Levin @ 2026-04-20 13:22 UTC (permalink / raw)
  To: patches, stable
  Cc: Lorenzo Bianconi, Simon Horman, Jakub Kicinski, Sasha Levin,
	andrew+netdev, davem, edumazet, pabeni, linux-arm-kernel,
	linux-mediatek, netdev, linux-kernel

From: Lorenzo Bianconi <lorenzo@kernel.org>

[ Upstream commit 285fa6b1e03cff78ead0383e1b259c44b95faf90 ]

If an error occurs on the subsequents buffers belonging to the
non-linear part of the skb (e.g. due to an error in the payload length
reported by the NIC or if we consumed all the available fragments for
the skb), the page_pool fragment will not be linked to the skb so it will
not return to the pool in the airoha_qdma_rx_process() error path. Fix the
memory leak partially reverting commit 'd6d2b0e1538d ("net: airoha: Fix
page recycling in airoha_qdma_rx_process()")' and always running
page_pool_put_full_page routine in the airoha_qdma_rx_process() error
path.

Fixes: d6d2b0e1538d ("net: airoha: Fix page recycling in airoha_qdma_rx_process()")
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20260402-airoha_qdma_rx_process-mem-leak-fix-v1-1-b5706f402d3c@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

LLM Generated explanations, may be completely bogus:

Error: Failed to generate final synthesis

 drivers/net/ethernet/airoha/airoha_eth.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/airoha/airoha_eth.c b/drivers/net/ethernet/airoha/airoha_eth.c
index 4fc6bd282b465..bdf600fea9508 100644
--- a/drivers/net/ethernet/airoha/airoha_eth.c
+++ b/drivers/net/ethernet/airoha/airoha_eth.c
@@ -709,9 +709,8 @@ static int airoha_qdma_rx_process(struct airoha_queue *q, int budget)
 		if (q->skb) {
 			dev_kfree_skb(q->skb);
 			q->skb = NULL;
-		} else {
-			page_pool_put_full_page(q->page_pool, page, true);
 		}
+		page_pool_put_full_page(q->page_pool, page, true);
 	}
 	airoha_qdma_fill_rx_queue(q);
 
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2026-04-20 13:33 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <20260420132314.1023554-1-sashal@kernel.org>
2026-04-20 13:17 ` [PATCH AUTOSEL 7.0-6.19] wifi: mt76: avoid to set ACK for MCU command if wait_resp is not set Sasha Levin
2026-04-20 13:17 ` [PATCH AUTOSEL 7.0-6.18] phy: phy-mtk-tphy: Update names and format of kernel-doc comments Sasha Levin
2026-04-20 13:18 ` [PATCH AUTOSEL 7.0-6.12] Bluetooth: btmtk: add MT7902 MCU support Sasha Levin
2026-04-20 13:18 ` [PATCH AUTOSEL 7.0-6.18] wifi: mt76: flush pending TX before channel switch Sasha Levin
2026-04-20 13:18 ` [PATCH AUTOSEL 7.0-6.6] wifi: mt76: fix list corruption in mt76_wcid_cleanup Sasha Levin
2026-04-20 13:18 ` [PATCH AUTOSEL 7.0-6.12] wifi: mt76: add missing lock protection in mt76_sta_state for sta_event callback Sasha Levin
2026-04-20 13:18 ` [PATCH AUTOSEL 7.0-6.1] Bluetooth: btmtk: improve mt79xx firmware setup retry flow Sasha Levin
2026-04-20 13:19 ` [PATCH AUTOSEL 7.0-6.18] wifi: mt76: mt7996: Disable Rx hdr_trans in monitor mode Sasha Levin
2026-04-20 13:19 ` [PATCH AUTOSEL 7.0-6.12] wifi: mt76: mt7925: Skip scan process during suspend Sasha Levin
2026-04-20 13:19 ` [PATCH AUTOSEL 7.0-5.10] wifi: mt76: mt76x02: wake queues after reconfig Sasha Levin
2026-04-20 13:19 ` [PATCH AUTOSEL 7.0-6.12] wifi: mt76: mt7925: resolve link after acquiring mt76 mutex Sasha Levin
2026-04-20 13:19 ` [PATCH AUTOSEL 7.0-6.18] wifi: mt76: mt7996: fix queue pause after scan due to wrong channel switch reason Sasha Levin
2026-04-20 13:20 ` [PATCH AUTOSEL 7.0-6.12] net: ethernet: mtk_eth_soc: avoid writing to ESW registers on MT7628 Sasha Levin
2026-04-20 13:20 ` [PATCH AUTOSEL 7.0-6.6] wifi: mt76: mt7996: reset device after MCU message timeout Sasha Levin
2026-04-20 13:21 ` [PATCH AUTOSEL 7.0-5.10] drm/mediatek: mtk_dsi: enable hs clock during pre-enable Sasha Levin
2026-04-20 13:21 ` [PATCH AUTOSEL 7.0-6.18] wifi: mt76: mt7996: fix frequency separation for station STR mode Sasha Levin
2026-04-20 13:22 ` [PATCH AUTOSEL 6.18] net: airoha: Fix memory leak in airoha_qdma_rx_process() Sasha Levin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox