* [PATCH] wifi: mt76: mt7925: fix NULL pointer dereference in vif iteration loops
@ 2025-12-31 5:29 Zac Bowling
  2025-12-31 22:37 ` [PATCH] wifi: mt76: mt7925: fix missing mutex protection in reset and ROC abort paths Zac Bowling
  0 siblings, 1 reply; 113+ messages in thread
From: Zac Bowling @ 2025-12-31 5:29 UTC (permalink / raw)
  To: linux-wireless
  Cc: lorenzo, nbd, ryder.lee, kvalo, sean.wang, deren.wu,
	linux-mediatek, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1219 bytes --]

I was getting a kernel panic on my new Framework Desktop running Ubuntu
25.10 with this specific WIFI chipset.

mt792x_vif_to_bss_conf() can return NULL when iterating over valid_links
during HW reset or other state transitions, because the link configuration
in mac80211 may not be set up yet even though the driver's valid_links
bitmap has the link marked as valid.

This causes a NULL pointer dereference in mt76_connac_mcu_uni_add_dev()
when it tries to access bss_conf->vif->type, and similar crashes in other
functions that use bss_conf without checking.

The crash manifests as:
  BUG: kernel NULL pointer dereference, address: 0000000000000000
  RIP: 0010:mt76_connac_mcu_uni_add_dev+0xba/0x1f0 [mt76_connac_lib]
  Call Trace:
   mt7925_vif_connect_iter+0xcb/0x240 [mt7925_common]
   __iterate_interfaces+0x92/0x130 [mac80211]
   ieee80211_iterate_interfaces+0x3d/0x60 [mac80211]
   mt7925_mac_reset_work+0x105/0x190 [mt7925_common]

Add NULL checks for bss_conf in all loops that iterate over valid_links
and call mt792x_vif_to_bss_conf(), skipping links where the mac80211 link
configuration is not yet available.
Reported-by: Zac Bowling <zac@zacbowling.com> Signed-off-by: Zac Bowling <zac@zacbowling.com> [-- Attachment #2: 0001-wifi-mt76-mt7925-fix-NULL-pointer-dereference-in-vif.patch --] [-- Type: application/octet-stream, Size: 3808 bytes --] From 6790e656030fb23527aa5c0d6eaa28ce029335b1 Mon Sep 17 00:00:00 2001 From: Zac Bowling <zac@zacbowling.com> Date: Tue, 30 Dec 2025 20:32:56 -0800 Subject: [PATCH] wifi: mt76: mt7925: fix NULL pointer dereference in vif iteration loops mt792x_vif_to_bss_conf() can return NULL when iterating over valid_links during HW reset or other state transitions, because the link configuration in mac80211 may not be set up yet even though the driver's valid_links bitmap has the link marked as valid. This causes a NULL pointer dereference in mt76_connac_mcu_uni_add_dev() when it tries to access bss_conf->vif->type, and similar crashes in other functions that use bss_conf without checking. The crash manifests as: BUG: kernel NULL pointer dereference, address: 0000000000000000 RIP: 0010:mt76_connac_mcu_uni_add_dev+0xba/0x1f0 [mt76_connac_lib] Call Trace: mt7925_vif_connect_iter+0xcb/0x240 [mt7925_common] __iterate_interfaces+0x92/0x130 [mac80211] ieee80211_iterate_interfaces+0x3d/0x60 [mac80211] mt7925_mac_reset_work+0x105/0x190 [mt7925_common] Add NULL checks for bss_conf in all loops that iterate over valid_links and call mt792x_vif_to_bss_conf(), skipping links where the mac80211 link configuration is not yet available. 
Reported-by: Zac Bowling <zac@zacbowling.com> Signed-off-by: Zac Bowling <zac@zacbowling.com> --- drivers/net/wireless/mediatek/mt76/mt7925/mac.c | 6 ++++++ drivers/net/wireless/mediatek/mt76/mt7925/main.c | 8 ++++++++ 2 files changed, 14 insertions(+) diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/mac.c b/drivers/net/wireless/mediatek/mt76/mt7925/mac.c index 871b67101..184efe8af 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7925/mac.c +++ b/drivers/net/wireless/mediatek/mt76/mt7925/mac.c @@ -1271,6 +1271,12 @@ mt7925_vif_connect_iter(void *priv, u8 *mac, bss_conf = mt792x_vif_to_bss_conf(vif, i); mconf = mt792x_vif_to_link(mvif, i); + /* Skip links that don't have bss_conf set up yet in mac80211. + * This can happen during HW reset when link state is inconsistent. + */ + if (!bss_conf) + continue; + mt76_connac_mcu_uni_add_dev(&dev->mphy, bss_conf, &mconf->mt76, &mvif->sta.deflink.wcid, true); mt7925_mcu_set_tx(dev, bss_conf); diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/main.c b/drivers/net/wireless/mediatek/mt76/mt7925/main.c index 2d358a966..3001a62a8 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7925/main.c +++ b/drivers/net/wireless/mediatek/mt76/mt7925/main.c @@ -1304,6 +1304,8 @@ mt7925_mlo_pm_iter(void *priv, u8 *mac, struct ieee80211_vif *vif) mt792x_mutex_acquire(dev); for_each_set_bit(i, &valid, IEEE80211_MLD_MAX_NUM_LINKS) { bss_conf = mt792x_vif_to_bss_conf(vif, i); + if (!bss_conf) + continue; mt7925_mcu_uni_bss_ps(dev, bss_conf); } mt792x_mutex_release(dev); @@ -1630,6 +1632,8 @@ static void mt7925_ipv6_addr_change(struct ieee80211_hw *hw, for_each_set_bit(i, &valid, IEEE80211_MLD_MAX_NUM_LINKS) { bss_conf = mt792x_vif_to_bss_conf(vif, i); + if (!bss_conf) + continue; __mt7925_ipv6_addr_change(hw, bss_conf, idev); } } @@ -1861,6 +1865,8 @@ static void mt7925_vif_cfg_changed(struct ieee80211_hw *hw, if (changed & BSS_CHANGED_ARP_FILTER) { for_each_set_bit(i, &valid, IEEE80211_MLD_MAX_NUM_LINKS) { bss_conf = 
mt792x_vif_to_bss_conf(vif, i); + if (!bss_conf) + continue; mt7925_mcu_update_arp_filter(&dev->mt76, bss_conf); } } @@ -1876,6 +1882,8 @@ static void mt7925_vif_cfg_changed(struct ieee80211_hw *hw, } else if (mvif->mlo_pm_state == MT792x_MLO_CHANGED_PS) { for_each_set_bit(i, &valid, IEEE80211_MLD_MAX_NUM_LINKS) { bss_conf = mt792x_vif_to_bss_conf(vif, i); + if (!bss_conf) + continue; mt7925_mcu_uni_bss_ps(dev, bss_conf); } } -- 2.51.0 ^ permalink raw reply related [flat|nested] 113+ messages in thread
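For readers outside the driver, the loop guard this patch adds can be sketched in plain userspace C. This is only an illustration under stated assumptions: struct vif, struct link_conf, MAX_LINKS, vif_to_link_conf() and configure_valid_links() are hypothetical stand-ins, not mt76 or mac80211 definitions. The point it shows is that a set bit in the bitmap does not guarantee a non-NULL lookup, so the loop skips such links instead of dereferencing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_LINKS 15 /* arbitrary bound chosen for this sketch */

/* Hypothetical stand-in for a per-link configuration object. */
struct link_conf {
	int link_id;
};

/* Hypothetical stand-in for the per-interface state. */
struct vif {
	uint16_t valid_links;                   /* driver-side bitmap */
	struct link_conf *link_conf[MAX_LINKS]; /* may lag behind the bitmap */
};

/* Plays the role of the lookup helper: it may return NULL. */
static struct link_conf *vif_to_link_conf(const struct vif *vif, unsigned int i)
{
	return i < MAX_LINKS ? vif->link_conf[i] : NULL;
}

/* Visit every link marked valid; return how many were actually usable. */
static int configure_valid_links(const struct vif *vif)
{
	int configured = 0;

	assert(vif != NULL);

	for (unsigned int i = 0; i < MAX_LINKS; i++) {
		struct link_conf *conf;

		if (!(vif->valid_links & (1u << i)))
			continue;

		conf = vif_to_link_conf(vif, i);
		if (!conf)
			continue; /* bitmap says valid, but no config yet */

		(void)conf->link_id; /* safe: conf is non-NULL here */
		configured++;
	}

	return configured;
}
```

With two bits set in valid_links but only one configuration pointer populated, configure_valid_links() visits one link and skips the other rather than crashing.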
* [PATCH] wifi: mt76: mt7925: fix missing mutex protection in reset and ROC abort paths 2025-12-31 5:29 [PATCH] wifi: mt76: mt7925: fix NULL pointer dereference in vif iteration loops Zac Bowling @ 2025-12-31 22:37 ` Zac Bowling 2026-01-01 0:22 ` [PATCH 2/3] wifi: mt76: mt7925: fix missing mutex protection in reset and ROC abort Zac Bowling 2026-01-01 0:23 ` [PATCH 3/3] wifi: mt76: mt7925: fix missing mutex protection in runtime PM and MLO PM Zac Bowling 0 siblings, 2 replies; 113+ messages in thread From: Zac Bowling @ 2025-12-31 22:37 UTC (permalink / raw) To: zac Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless, lorenzo, nbd, ryder.lee, sean.wang From: Zac Bowling <zac@zacbowling.com> This patch is a follow-up to the NULL pointer dereference fix (commit 6790e656030fb23527aa5c0d6eaa28ce029335b1). While that patch prevented kernel panics from NULL pointer dereferences, it did not address the underlying system hangs and deadlocks that occur during firmware recovery. The issue manifests on Framework Desktop systems with MT7925 WiFi cards when: 1. Switching between WiFi networks 2. Disconnecting/reconnecting ethernet while WiFi is active 3. Firmware message timeouts trigger hardware reset recovery During these operations, MCU message timeouts can occur, triggering mt792x_reset() which queues reset_work. The reset work and ROC abort functions iterate over active interfaces and call MCU functions that require the device mutex to be held, but the mutex was not acquired before the iteration. This causes system-wide hangs where: - Network commands (ip, etc.) hang indefinitely - Processes get stuck in uninterruptible sleep (D state) - Tailscale and other network services timeout - System becomes completely unresponsive requiring force reboot The hang occurs because: 1. Firmware timeouts trigger hardware reset via mt792x_reset() 2. Reset work (mt7925_mac_reset_work) or ROC abort (mt7925_roc_abort_sync) tries to iterate interfaces and call MCU functions 3. 
MCU operations block indefinitely waiting for mutex that's held elsewhere, or deadlock occurs 4. Network stack becomes unresponsive Add mutex protection around interface iteration in both: - mt7925_mac_reset_work(): Called during firmware recovery after MCU timeouts to reconnect all interfaces - mt7925_roc_abort_sync(): Called during suspend/resume and when aborting Remain On Channel operations This matches the pattern used elsewhere in the driver (e.g., in mt7925_roc_iter, mt7925_mcu_set_suspend_iter, etc.) where interface iteration callbacks invoke MCU functions. Note: The author does not have deep familiarity with this codebase, but this fix has been tested and appears to resolve the panic and deadlock issues observed on Framework Desktop hardware with MT7925 WiFi cards. Reported-by: Zac Bowling <zac@zacbowling.com> Tested-by: Zac Bowling <zac@zacbowling.com> Signed-off-by: Zac Bowling <zac@zacbowling.com> --- drivers/net/wireless/mediatek/mt76/mt7925/mac.c | 2 ++ drivers/net/wireless/mediatek/mt76/mt7925/main.c | 5 ++++- 2 files changed, 6 insertions(+), 1 deletion(-) diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/mac.c b/drivers/net/wireless/mediatek/mt76/mt7925/mac.c index 184efe8afa10..06420ac6ed55 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7925/mac.c +++ b/drivers/net/wireless/mediatek/mt76/mt7925/mac.c @@ -1331,9 +1331,11 @@ void mt7925_mac_reset_work(struct work_struct *work) dev->hw_full_reset = false; pm->suspended = false; ieee80211_wake_queues(hw); + mt792x_mutex_acquire(dev); ieee80211_iterate_active_interfaces(hw, IEEE80211_IFACE_ITER_RESUME_ALL, mt7925_vif_connect_iter, NULL); + mt792x_mutex_release(dev); mt76_connac_power_save_sched(&dev->mt76.phy, pm); mt7925_regd_change(&dev->phy, "00"); diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/main.c b/drivers/net/wireless/mediatek/mt76/mt7925/main.c index 3001a62a8b67..1f7661175623 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7925/main.c +++ 
b/drivers/net/wireless/mediatek/mt76/mt7925/main.c @@ -459,10 +459,13 @@ void mt7925_roc_abort_sync(struct mt792x_dev *dev) timer_delete_sync(&phy->roc_timer); cancel_work_sync(&phy->roc_work); - if (test_and_clear_bit(MT76_STATE_ROC, &phy->mt76->state)) + if (test_and_clear_bit(MT76_STATE_ROC, &phy->mt76->state)) { + mt792x_mutex_acquire(dev); ieee80211_iterate_interfaces(mt76_hw(dev), IEEE80211_IFACE_ITER_RESUME_ALL, mt7925_roc_iter, (void *)phy); + mt792x_mutex_release(dev); + } } EXPORT_SYMBOL_GPL(mt7925_roc_abort_sync); -- 2.51.0 ^ permalink raw reply related [flat|nested] 113+ messages in thread
* [PATCH 2/3] wifi: mt76: mt7925: fix missing mutex protection in reset and ROC abort
  2025-12-31 22:37 ` [PATCH] wifi: mt76: mt7925: fix missing mutex protection in reset and ROC abort paths Zac Bowling
@ 2026-01-01 0:22   ` Zac Bowling
  2026-01-01 0:23   ` [PATCH 3/3] wifi: mt76: mt7925: fix missing mutex protection in runtime PM and MLO PM Zac Bowling
  1 sibling, 0 replies; 113+ messages in thread
From: Zac Bowling @ 2026-01-01 0:22 UTC (permalink / raw)
  To: zbowling
  Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless,
	lorenzo, nbd, ryder.lee, sean.wang, zac

From: Zac Bowling <zac@zacbowling.com>

During firmware recovery and ROC (Remain On Channel) abort operations,
the driver iterates over active interfaces and calls MCU functions that
require the device mutex to be held, but the mutex was not acquired.

This causes system-wide hangs where network commands hang indefinitely,
processes get stuck in uninterruptible sleep (D state), and the system
becomes completely unresponsive, requiring a forced reboot.

Add mutex protection around interface iteration in:
- mt7925_mac_reset_work(): Called during firmware recovery after MCU
  timeouts to reconnect all interfaces
- mt7925_roc_abort_sync(): Called during suspend/resume and when aborting
  Remain On Channel operations. The mutex is taken at its call site in
  mt7925_pci_suspend() rather than inside the function, because the
  station remove path calls it with the mutex already held

This matches the pattern used elsewhere in the driver where interface
iteration callbacks invoke MCU functions.
Reported-by: Zac Bowling <zac@zacbowling.com>
Tested-by: Zac Bowling <zac@zacbowling.com>
Signed-off-by: Zac Bowling <zac@zacbowling.com>
---
 drivers/net/wireless/mediatek/mt76/mt7925/mac.c | 2 ++
 drivers/net/wireless/mediatek/mt76/mt7925/pci.c | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/mac.c b/drivers/net/wireless/mediatek/mt76/mt7925/mac.c
index 184efe8afa10..06420ac6ed55 100644
--- a/drivers/net/wireless/mediatek/mt76/mt7925/mac.c
+++ b/drivers/net/wireless/mediatek/mt76/mt7925/mac.c
@@ -1331,9 +1331,11 @@ void mt7925_mac_reset_work(struct work_struct *work)
 	dev->hw_full_reset = false;
 	pm->suspended = false;
 	ieee80211_wake_queues(hw);
+	mt792x_mutex_acquire(dev);
 	ieee80211_iterate_active_interfaces(hw,
 					    IEEE80211_IFACE_ITER_RESUME_ALL,
 					    mt7925_vif_connect_iter, NULL);
+	mt792x_mutex_release(dev);
 	mt76_connac_power_save_sched(&dev->mt76.phy, pm);
 
 	mt7925_regd_change(&dev->phy, "00");
diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/pci.c b/drivers/net/wireless/mediatek/mt76/mt7925/pci.c
index c4161754c01d..e9d62c6aee91 100644
--- a/drivers/net/wireless/mediatek/mt76/mt7925/pci.c
+++ b/drivers/net/wireless/mediatek/mt76/mt7925/pci.c
@@ -455,7 +455,9 @@ static int mt7925_pci_suspend(struct device *device)
 	cancel_delayed_work_sync(&pm->ps_work);
 	cancel_work_sync(&pm->wake_work);
 
+	mt792x_mutex_acquire(dev);
 	mt7925_roc_abort_sync(dev);
+	mt792x_mutex_release(dev);
 
 	err = mt792x_mcu_drv_pmctrl(dev);
 	if (err < 0)
-- 
2.51.0

^ permalink raw reply related [flat|nested] 113+ messages in thread
* [PATCH 3/3] wifi: mt76: mt7925: fix missing mutex protection in runtime PM and MLO PM
  2025-12-31 22:37 ` [PATCH] wifi: mt76: mt7925: fix missing mutex protection in reset and ROC abort paths Zac Bowling
  2026-01-01 0:22   ` [PATCH 2/3] wifi: mt76: mt7925: fix missing mutex protection in reset and ROC abort Zac Bowling
@ 2026-01-01 0:23   ` Zac Bowling
  2026-01-01 0:41     ` Zac Bowling
  1 sibling, 1 reply; 113+ messages in thread
From: Zac Bowling @ 2026-01-01 0:23 UTC (permalink / raw)
  To: zbowling
  Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless,
	lorenzo, nbd, ryder.lee, sean.wang, zac

From: Zac Bowling <zac@zacbowling.com>

Two additional code paths were identified that iterate over active
interfaces and call MCU functions without proper mutex protection:

1. mt7925_set_runtime_pm(): Called when runtime PM settings change.
   The callback mt7925_pm_interface_iter() calls mt7925_mcu_set_beacon_filter()
   which in turn calls mt7925_mcu_set_rxfilter(). These MCU functions require
   the device mutex to be held.

2. mt7925_mlo_pm_work(): A workqueue function for MLO power management.
   The callback mt7925_mlo_pm_iter() was acquiring mutex internally, which
   is inconsistent with the rest of the driver where the caller holds the
   mutex during interface iteration. Move the mutex to the caller for
   consistency and to prevent potential race conditions.

The impact of these bugs:
- mt7925_set_runtime_pm(): Can cause deadlocks when power management
  settings are changed while WiFi is active
- mt7925_mlo_pm_work(): Can cause race conditions during MLO power save
  state transitions

Note: Similar bugs exist in the mt7921 driver and should be fixed in a
separate patch series.
Reported-by: Zac Bowling <zac@zacbowling.com>
Tested-by: Zac Bowling <zac@zacbowling.com>
Signed-off-by: Zac Bowling <zac@zacbowling.com>
---
 drivers/net/wireless/mediatek/mt76/mt7925/main.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/main.c b/drivers/net/wireless/mediatek/mt76/mt7925/main.c
index 3001a62a8b67..9f17b21aef1c 100644
--- a/drivers/net/wireless/mediatek/mt76/mt7925/main.c
+++ b/drivers/net/wireless/mediatek/mt76/mt7925/main.c
@@ -751,9 +751,11 @@ void mt7925_set_runtime_pm(struct mt792x_dev *dev)
 	bool monitor = !!(hw->conf.flags & IEEE80211_CONF_MONITOR);
 
 	pm->enable = pm->enable_user && !monitor;
+	mt792x_mutex_acquire(dev);
 	ieee80211_iterate_active_interfaces(hw,
 					    IEEE80211_IFACE_ITER_RESUME_ALL,
 					    mt7925_pm_interface_iter, dev);
+	mt792x_mutex_release(dev);
 	pm->ds_enable = pm->ds_enable_user && !monitor;
 	mt7925_mcu_set_deep_sleep(dev, pm->ds_enable);
 }
@@ -1301,14 +1303,12 @@ mt7925_mlo_pm_iter(void *priv, u8 *mac, struct ieee80211_vif *vif)
 	if (mvif->mlo_pm_state != MT792x_MLO_CHANGED_PS)
 		return;
 
-	mt792x_mutex_acquire(dev);
 	for_each_set_bit(i, &valid, IEEE80211_MLD_MAX_NUM_LINKS) {
 		bss_conf = mt792x_vif_to_bss_conf(vif, i);
 		if (!bss_conf)
 			continue;
 		mt7925_mcu_uni_bss_ps(dev, bss_conf);
 	}
-	mt792x_mutex_release(dev);
 }
 
 void mt7925_mlo_pm_work(struct work_struct *work)
@@ -1317,9 +1317,11 @@ void mt7925_mlo_pm_work(struct work_struct *work)
 						  mlo_pm_work.work);
 	struct ieee80211_hw *hw = mt76_hw(dev);
 
+	mt792x_mutex_acquire(dev);
 	ieee80211_iterate_active_interfaces(hw,
 					    IEEE80211_IFACE_ITER_RESUME_ALL,
 					    mt7925_mlo_pm_iter, dev);
+	mt792x_mutex_release(dev);
 }
 
 void mt7925_scan_work(struct work_struct *work)
-- 
2.51.0

^ permalink raw reply related [flat|nested] 113+ messages in thread
* Re: [PATCH 3/3] wifi: mt76: mt7925: fix missing mutex protection in runtime PM and MLO PM
  2026-01-01 0:23   ` [PATCH 3/3] wifi: mt76: mt7925: fix missing mutex protection in runtime PM and MLO PM Zac Bowling
@ 2026-01-01 0:41     ` Zac Bowling
  2026-01-01 6:25       ` [PATCH] wifi: mt76: mt7925: add NULL checks in MCU STA TLV functions Zac Bowling
                         ` (3 more replies)
  0 siblings, 4 replies; 113+ messages in thread
From: Zac Bowling @ 2026-01-01 0:41 UTC (permalink / raw)
  To: zbowling
  Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless,
	lorenzo, nbd, ryder.lee, sean.wang

Note, this is an update to the original patch I sent.

In this v2 patch, I moved mutex protection from inside
mt7925_roc_abort_sync() to the call site in pci.c. The previous approach
caused a self-deadlock when mt7925_roc_abort_sync() was called from the
station remove path, which already holds the mutex.

I created a repo with all these patches if that makes it easier:
https://github.com/zbowling/mt7925

These bugs also exist in the mt7921 driver, from which the mt7925 driver
appears to have been forked. These lock patterns match the much older
mt7615 driver and other wifi drivers.

Zac Bowling

On Wed, Dec 31, 2025 at 4:23 PM Zac Bowling <zbowling@gmail.com> wrote:
>
> From: Zac Bowling <zac@zacbowling.com>
>
> Two additional code paths were identified that iterate over active
> interfaces and call MCU functions without proper mutex protection:
>
> 1. mt7925_set_runtime_pm(): Called when runtime PM settings change.
> The callback mt7925_pm_interface_iter() calls mt7925_mcu_set_beacon_filter()
> which in turn calls mt7925_mcu_set_rxfilter(). These MCU functions require
> the device mutex to be held.
>
> 2. mt7925_mlo_pm_work(): A workqueue function for MLO power management.
> The callback mt7925_mlo_pm_iter() was acquiring mutex internally, which
> is inconsistent with the rest of the driver where the caller holds the
> mutex during interface iteration. Move the mutex to the caller for
> consistency and to prevent potential race conditions.
>
> The impact of these bugs:
> - mt7925_set_runtime_pm(): Can cause deadlocks when power management
> settings are changed while WiFi is active
> - mt7925_mlo_pm_work(): Can cause race conditions during MLO power save
> state transitions
>
> Note: Similar bugs exist in the mt7921 driver and should be fixed in a
> separate patch series.
>
> Reported-by: Zac Bowling <zac@zacbowling.com>
> Tested-by: Zac Bowling <zac@zacbowling.com>
> Signed-off-by: Zac Bowling <zac@zacbowling.com>
> ---
> drivers/net/wireless/mediatek/mt76/mt7925/main.c | 6 ++++--
> 1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/main.c b/drivers/net/wireless/mediatek/mt76/mt7925/main.c
> index 3001a62a8b67..9f17b21aef1c 100644
> --- a/drivers/net/wireless/mediatek/mt76/mt7925/main.c
> +++ b/drivers/net/wireless/mediatek/mt76/mt7925/main.c
> @@ -751,9 +751,11 @@ void mt7925_set_runtime_pm(struct mt792x_dev *dev)
>  	bool monitor = !!(hw->conf.flags & IEEE80211_CONF_MONITOR);
>
>  	pm->enable = pm->enable_user && !monitor;
> +	mt792x_mutex_acquire(dev);
>  	ieee80211_iterate_active_interfaces(hw,
>  					    IEEE80211_IFACE_ITER_RESUME_ALL,
>  					    mt7925_pm_interface_iter, dev);
> +	mt792x_mutex_release(dev);
>  	pm->ds_enable = pm->ds_enable_user && !monitor;
>  	mt7925_mcu_set_deep_sleep(dev, pm->ds_enable);
>  }
> @@ -1301,14 +1303,12 @@ mt7925_mlo_pm_iter(void *priv, u8 *mac, struct ieee80211_vif *vif)
>  	if (mvif->mlo_pm_state != MT792x_MLO_CHANGED_PS)
>  		return;
>
> -	mt792x_mutex_acquire(dev);
>  	for_each_set_bit(i, &valid, IEEE80211_MLD_MAX_NUM_LINKS) {
>  		bss_conf = mt792x_vif_to_bss_conf(vif, i);
>  		if (!bss_conf)
>  			continue;
>  		mt7925_mcu_uni_bss_ps(dev, bss_conf);
>  	}
> -	mt792x_mutex_release(dev);
>  }
>
>  void mt7925_mlo_pm_work(struct work_struct *work)
> @@ -1317,9 +1317,11 @@ void mt7925_mlo_pm_work(struct work_struct *work)
>  						  mlo_pm_work.work);
>  	struct ieee80211_hw *hw = mt76_hw(dev);
>
> +	mt792x_mutex_acquire(dev);
>  	ieee80211_iterate_active_interfaces(hw,
>  					    IEEE80211_IFACE_ITER_RESUME_ALL,
>  					    mt7925_mlo_pm_iter, dev);
> +	mt792x_mutex_release(dev);
>  }
>
>  void mt7925_scan_work(struct work_struct *work)
> --
> 2.51.0
>

^ permalink raw reply [flat|nested] 113+ messages in thread
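The self-deadlock described in the note above can be modeled in a few lines of userspace C. This is a sketch under assumptions, not driver code: struct dev_lock, dev_trylock(), dev_unlock() and the sta_remove_* and abort_sync_* functions are hypothetical names, and a boolean flag stands in for a non-recursive mutex so that the case where a real mutex would block forever shows up as a false return instead of a hang.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical non-recursive lock; `held` stands in for mutex state. */
struct dev_lock {
	bool held;
};

/* Returns false where a real non-recursive mutex would block forever. */
static bool dev_trylock(struct dev_lock *l)
{
	if (l->held)
		return false;
	l->held = true;
	return true;
}

static void dev_unlock(struct dev_lock *l)
{
	assert(l->held);
	l->held = false;
}

/* v1 shape: the helper takes the lock itself. */
static bool abort_sync_locks_inside(struct dev_lock *l)
{
	if (!dev_trylock(l))
		return false; /* caller already holds it: self-deadlock */
	/* ... iterate interfaces here ... */
	dev_unlock(l);
	return true;
}

/* v2 shape: the helper requires the caller to hold the lock. */
static void abort_sync_caller_locked(struct dev_lock *l)
{
	assert(l->held); /* plays the role of a held-lock assertion */
	/* ... iterate interfaces here ... */
}

/* A path that already holds the lock, like the station remove path. */
static bool sta_remove_v1(struct dev_lock *l)
{
	bool ok;

	if (!dev_trylock(l))
		return false;
	ok = abort_sync_locks_inside(l);
	dev_unlock(l);
	return ok;
}

static bool sta_remove_v2(struct dev_lock *l)
{
	if (!dev_trylock(l))
		return false;
	abort_sync_caller_locked(l);
	dev_unlock(l);
	return true;
}
```

In this model sta_remove_v1() reports failure because the inner helper cannot take a lock its caller already holds, while sta_remove_v2() succeeds. The cost of the v2 shape is that every caller becomes responsible for taking the lock first.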
* [PATCH] wifi: mt76: mt7925: add NULL checks in MCU STA TLV functions
  2026-01-01 0:41     ` Zac Bowling
@ 2026-01-01 6:25       ` Zac Bowling
  2026-01-01 6:25         ` [PATCH] wifi: mt76: mt7925: add NULL checks for link_conf and mlink in main.c Zac Bowling
  2026-01-01 6:25         ` [PATCH] wifi: mt76: mt7925: add NULL checks in MLO link and chanctx functions Zac Bowling
  2026-01-01 6:25       ` [PATCH] wifi: mt76: mt7925: add error handling for AMPDU MCU commands Zac Bowling
                         ` (2 subsequent siblings)
  3 siblings, 2 replies; 113+ messages in thread
From: Zac Bowling @ 2026-01-01 6:25 UTC (permalink / raw)
  To: linux-wireless
  Cc: linux-mediatek, linux-kernel, kvalo, lorenzo, nbd, sean.wang,
	deren.wu, ryder.lee

From: Zac Bowling <zac@zacbowling.com>

Add NULL pointer checks for link_conf and mconf in:
- mt7925_mcu_sta_phy_tlv(): builds PHY capability TLV for station record
- mt7925_mcu_sta_rate_ctrl_tlv(): builds rate control TLV for station record

Both functions call mt792x_vif_to_bss_conf() and mt792x_vif_to_link()
which can return NULL during MLO link state transitions when the link
configuration in mac80211 is not yet synchronized with the driver's
link tracking.

Without these checks, the driver will crash with a NULL pointer
dereference when accessing link_conf->chanreq.oper or
link_conf->basic_rates.

Reported-by: Zac Bowling <zac@zacbowling.com>
Signed-off-by: Zac Bowling <zac@zacbowling.com>
---
 drivers/net/wireless/mediatek/mt76/mt7925/mcu.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/mcu.c b/drivers/net/wireless/mediatek/mt76/mt7925/mcu.c
index cf0fdea45cf7..d61a7fbda745 100644
--- a/drivers/net/wireless/mediatek/mt76/mt7925/mcu.c
+++ b/drivers/net/wireless/mediatek/mt76/mt7925/mcu.c
@@ -1773,6 +1773,10 @@ mt7925_mcu_sta_phy_tlv(struct sk_buff *skb,
 
 	link_conf = mt792x_vif_to_bss_conf(vif, link_sta->link_id);
 	mconf = mt792x_vif_to_link(mvif, link_sta->link_id);
+
+	if (!link_conf || !mconf)
+		return;
+
 	chandef = mconf->mt76.ctx ? &mconf->mt76.ctx->def :
 				    &link_conf->chanreq.oper;
 
@@ -1851,6 +1855,10 @@ mt7925_mcu_sta_rate_ctrl_tlv(struct sk_buff *skb,
 
 	link_conf = mt792x_vif_to_bss_conf(vif, link_sta->link_id);
 	mconf = mt792x_vif_to_link(mvif, link_sta->link_id);
+
+	if (!link_conf || !mconf)
+		return;
+
 	chandef = mconf->mt76.ctx ? &mconf->mt76.ctx->def :
 				    &link_conf->chanreq.oper;
 	band = chandef->chan->band;
-- 
2.51.0

^ permalink raw reply related [flat|nested] 113+ messages in thread
* [PATCH] wifi: mt76: mt7925: add NULL checks for link_conf and mlink in main.c
  2026-01-01 6:25       ` [PATCH] wifi: mt76: mt7925: add NULL checks in MCU STA TLV functions Zac Bowling
@ 2026-01-01 6:25         ` Zac Bowling
  2026-01-01 6:25         ` [PATCH] wifi: mt76: mt7925: add NULL checks in MLO link and chanctx functions Zac Bowling
  1 sibling, 0 replies; 113+ messages in thread
From: Zac Bowling @ 2026-01-01 6:25 UTC (permalink / raw)
  To: linux-wireless
  Cc: linux-mediatek, linux-kernel, kvalo, lorenzo, nbd, sean.wang,
	deren.wu, ryder.lee

From: Zac Bowling <zac@zacbowling.com>

Add NULL pointer checks throughout main.c for functions that call
mt792x_vif_to_bss_conf(), mt792x_vif_to_link(), and mt792x_sta_to_link()
without verifying the return value before dereferencing.

Functions fixed:
- mt7925_set_link_key(): Check link_conf, mconf, and mlink before use
- mt7925_mac_link_sta_add(): Check link_conf before BSS info update
- mt7925_mac_link_sta_assoc(): Check mlink and link_conf before use
- mt7925_mac_link_sta_remove(): Check mlink and link_conf, add goto label
  for proper cleanup path
- mt7925_change_vif_links(): Check link_conf before adding BSS

These lookups can return NULL when the link configuration in mac80211 is
not yet synchronized with the driver's link tracking during MLO operations
or state transitions.

Without these checks, the driver will crash with NULL pointer dereferences
during station add/remove/association operations.
Reported-by: Zac Bowling <zac@zacbowling.com> Signed-off-by: Zac Bowling <zac@zacbowling.com> --- .../net/wireless/mediatek/mt76/mt7925/main.c | 27 ++++++++++++++++--- 1 file changed, 23 insertions(+), 4 deletions(-) diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/main.c b/drivers/net/wireless/mediatek/mt76/mt7925/main.c index 9f17b21aef1c..7d3322461bcf 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7925/main.c +++ b/drivers/net/wireless/mediatek/mt76/mt7925/main.c @@ -604,6 +604,10 @@ static int mt7925_set_link_key(struct ieee80211_hw *hw, enum set_key_cmd cmd, link_sta = sta ? mt792x_sta_to_link_sta(vif, sta, link_id) : NULL; mconf = mt792x_vif_to_link(mvif, link_id); mlink = mt792x_sta_to_link(msta, link_id); + + if (!link_conf || !mconf || !mlink) + return -EINVAL; + wcid = &mlink->wcid; wcid_keyidx = &wcid->hw_key_idx; @@ -889,6 +893,8 @@ static int mt7925_mac_link_sta_add(struct mt76_dev *mdev, MT_WTBL_UPDATE_ADM_COUNT_CLEAR); link_conf = mt792x_vif_to_bss_conf(vif, link_id); + if (!link_conf) + return -EINVAL; /* should update bss info before STA add */ if (vif->type == NL80211_IFTYPE_STATION && !link_sta->sta->tdls) { @@ -1034,6 +1040,8 @@ static void mt7925_mac_link_sta_assoc(struct mt76_dev *mdev, msta = (struct mt792x_sta *)link_sta->sta->drv_priv; mlink = mt792x_sta_to_link(msta, link_sta->link_id); + if (!mlink) + return; mt792x_mutex_acquire(dev); @@ -1043,12 +1051,13 @@ static void mt7925_mac_link_sta_assoc(struct mt76_dev *mdev, link_conf = mt792x_vif_to_bss_conf(vif, vif->bss_conf.link_id); } - if (vif->type == NL80211_IFTYPE_STATION && !link_sta->sta->tdls) { + if (link_conf && vif->type == NL80211_IFTYPE_STATION && !link_sta->sta->tdls) { struct mt792x_bss_conf *mconf; mconf = mt792x_link_conf_to_mconf(link_conf); - mt7925_mcu_add_bss_info(&dev->phy, mconf->mt76.ctx, - link_conf, link_sta, true); + if (mconf) + mt7925_mcu_add_bss_info(&dev->phy, mconf->mt76.ctx, + link_conf, link_sta, true); } 
ewma_avg_signal_init(&mlink->avg_ack_signal); @@ -1095,6 +1104,8 @@ static void mt7925_mac_link_sta_remove(struct mt76_dev *mdev, msta = (struct mt792x_sta *)link_sta->sta->drv_priv; mlink = mt792x_sta_to_link(msta, link_id); + if (!mlink) + return; mt7925_roc_abort_sync(dev); @@ -1108,10 +1119,12 @@ static void mt7925_mac_link_sta_remove(struct mt76_dev *mdev, link_conf = mt792x_vif_to_bss_conf(vif, link_id); - if (vif->type == NL80211_IFTYPE_STATION && !link_sta->sta->tdls) { + if (link_conf && vif->type == NL80211_IFTYPE_STATION && !link_sta->sta->tdls) { struct mt792x_bss_conf *mconf; mconf = mt792x_link_conf_to_mconf(link_conf); + if (!mconf) + goto out; if (ieee80211_vif_is_mld(vif)) mt792x_mac_link_bss_remove(dev, mconf, mlink); @@ -1119,6 +1132,7 @@ static void mt7925_mac_link_sta_remove(struct mt76_dev *mdev, mt7925_mcu_add_bss_info(&dev->phy, mconf->mt76.ctx, link_conf, link_sta, false); } +out: spin_lock_bh(&mdev->sta_poll_lock); if (!list_empty(&mlink->wcid.poll_list)) @@ -2031,6 +2045,11 @@ mt7925_change_vif_links(struct ieee80211_hw *hw, struct ieee80211_vif *vif, mlink = mlinks[link_id]; link_conf = mt792x_vif_to_bss_conf(vif, link_id); + if (!link_conf) { + err = -EINVAL; + goto free; + } + rcu_assign_pointer(mvif->link_conf[link_id], mconf); rcu_assign_pointer(mvif->sta.link[link_id], mlink); -- 2.51.0 ^ permalink raw reply related [flat|nested] 113+ messages in thread
* [PATCH] wifi: mt76: mt7925: add NULL checks in MLO link and chanctx functions
  2026-01-01 6:25       ` [PATCH] wifi: mt76: mt7925: add NULL checks in MCU STA TLV functions Zac Bowling
  2026-01-01 6:25         ` [PATCH] wifi: mt76: mt7925: add NULL checks for link_conf and mlink in main.c Zac Bowling
@ 2026-01-01 6:25         ` Zac Bowling
  1 sibling, 0 replies; 113+ messages in thread
From: Zac Bowling @ 2026-01-01 6:25 UTC (permalink / raw)
  To: linux-wireless
  Cc: linux-mediatek, linux-kernel, kvalo, lorenzo, nbd, sean.wang,
	deren.wu, ryder.lee

From: Zac Bowling <zac@zacbowling.com>

Add NULL pointer checks for mconf and link_conf in several functions
that were missing validation after calling mt792x_vif_to_link() and
mt792x_vif_to_bss_conf().

Functions fixed:
- mt7925_mac_set_links(): Check both primary and secondary link_conf
  before dereferencing chanreq.oper for band selection
- mt7925_link_info_changed(): Check mconf before using it to get
  link_conf, prevents NULL dereference chain
- mt7925_assign_vif_chanctx(): Check mconf before use, return -EINVAL
  if NULL; check pri_link_conf before passing to MCU function
- mt7925_unassign_vif_chanctx(): Check mconf before dereferencing,
  return early if NULL during MLO cleanup

These functions handle MLO (Multi-Link Operation) scenarios where link
configurations may not be fully set up when called, particularly during
rapid link state transitions or error recovery paths.
Reported-by: Zac Bowling <zac@zacbowling.com> Signed-off-by: Zac Bowling <zac@zacbowling.com> --- .../net/wireless/mediatek/mt76/mt7925/main.c | 39 +++++++++++++++---- 1 file changed, 32 insertions(+), 7 deletions(-) diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/main.c b/drivers/net/wireless/mediatek/mt76/mt7925/main.c index 058394b2e067..852cf8ff842f 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7925/main.c +++ b/drivers/net/wireless/mediatek/mt76/mt7925/main.c @@ -1006,18 +1006,29 @@ mt7925_mac_set_links(struct mt76_dev *mdev, struct ieee80211_vif *vif) { struct mt792x_dev *dev = container_of(mdev, struct mt792x_dev, mt76); struct mt792x_vif *mvif = (struct mt792x_vif *)vif->drv_priv; - struct ieee80211_bss_conf *link_conf = - mt792x_vif_to_bss_conf(vif, mvif->deflink_id); - struct cfg80211_chan_def *chandef = &link_conf->chanreq.oper; - enum nl80211_band band = chandef->chan->band, secondary_band; + struct ieee80211_bss_conf *link_conf; + struct cfg80211_chan_def *chandef; + enum nl80211_band band, secondary_band; + u16 sel_links; + u8 secondary_link_id; + + link_conf = mt792x_vif_to_bss_conf(vif, mvif->deflink_id); + if (!link_conf) + return; - u16 sel_links = mt76_select_links(vif, 2); - u8 secondary_link_id = __ffs(~BIT(mvif->deflink_id) & sel_links); + chandef = &link_conf->chanreq.oper; + band = chandef->chan->band; + + sel_links = mt76_select_links(vif, 2); + secondary_link_id = __ffs(~BIT(mvif->deflink_id) & sel_links); if (!ieee80211_vif_is_mld(vif) || hweight16(sel_links) < 2) return; link_conf = mt792x_vif_to_bss_conf(vif, secondary_link_id); + if (!link_conf) + return; + secondary_band = link_conf->chanreq.oper.chan->band; if (band == NL80211_BAND_2GHZ || @@ -1927,7 +1938,12 @@ static void mt7925_link_info_changed(struct ieee80211_hw *hw, struct ieee80211_bss_conf *link_conf; mconf = mt792x_vif_to_link(mvif, info->link_id); + if (!mconf) + return; + link_conf = mt792x_vif_to_bss_conf(vif, mconf->link_id); + if (!link_conf) + return; 
mt792x_mutex_acquire(dev); @@ -2136,9 +2152,14 @@ static int mt7925_assign_vif_chanctx(struct ieee80211_hw *hw, if (ieee80211_vif_is_mld(vif)) { mconf = mt792x_vif_to_link(mvif, link_conf->link_id); + if (!mconf) { + mutex_unlock(&dev->mt76.mutex); + return -EINVAL; + } + pri_link_conf = mt792x_vif_to_bss_conf(vif, mvif->deflink_id); - if (vif->type == NL80211_IFTYPE_STATION && + if (pri_link_conf && vif->type == NL80211_IFTYPE_STATION && mconf == &mvif->bss_conf) mt7925_mcu_add_bss_info(&dev->phy, NULL, pri_link_conf, NULL, true); @@ -2167,6 +2188,10 @@ static void mt7925_unassign_vif_chanctx(struct ieee80211_hw *hw, if (ieee80211_vif_is_mld(vif)) { mconf = mt792x_vif_to_link(mvif, link_conf->link_id); + if (!mconf) { + mutex_unlock(&dev->mt76.mutex); + return; + } if (vif->type == NL80211_IFTYPE_STATION && mconf == &mvif->bss_conf) -- 2.51.0 ^ permalink raw reply related [flat|nested] 113+ messages in thread
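The chanctx hunks above add early returns inside a locked region, and each of them has to release the lock first. A minimal userspace sketch of the same idea follows, written with a single exit label as an alternative shape. This is not what the patch does, since it unlocks inline before each return. struct dev, struct link, dev_lock(), dev_unlock() and assign_link() are hypothetical stand-ins for illustration only.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device with a flag standing in for its mutex. */
struct dev {
	bool locked;
};

/* Hypothetical per-link state that a lookup may fail to find. */
struct link {
	int id;
};

static void dev_lock(struct dev *d)
{
	assert(!d->locked);
	d->locked = true;
}

static void dev_unlock(struct dev *d)
{
	assert(d->locked);
	d->locked = false;
}

/* Single unlock site: every exit path after dev_lock() goes through `out`. */
static int assign_link(struct dev *d, struct link *mconf)
{
	int err = 0;

	dev_lock(d);

	if (!mconf) {
		err = -EINVAL; /* lookup failed: bail out, but still unlock */
		goto out;
	}

	mconf->id = 1; /* stand-in for the real work done under the lock */
out:
	dev_unlock(d);
	return err;
}
```

Either shape works. The exit label keeps one unlock site as more checks are added, while the inline unlock keeps each early return self-contained.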
* [PATCH] wifi: mt76: mt7925: add error handling for AMPDU MCU commands 2026-01-01 0:41 ` Zac Bowling 2026-01-01 6:25 ` [PATCH] wifi: mt76: mt7925: add NULL checks in MCU STA TLV functions Zac Bowling @ 2026-01-01 6:25 ` Zac Bowling 2026-01-01 6:25 ` [PATCH] wifi: mt76: mt7925: add error handling for BSS info MCU command in sta_add Zac Bowling 2026-01-01 6:25 ` [PATCH] wifi: mt76: mt7925: add error handling for BSS info in key setup Zac Bowling 2026-01-01 6:25 ` [PATCH] wifi: mt76: mt7921: fix missing mutex protection in multiple paths Zac Bowling 2026-01-01 6:25 ` [PATCH] wifi: mt76: mt7925: add lockdep assertions for mutex verification Zac Bowling 3 siblings, 2 replies; 113+ messages in thread From: Zac Bowling @ 2026-01-01 6:25 UTC (permalink / raw) To: linux-wireless Cc: linux-mediatek, linux-kernel, kvalo, lorenzo, nbd, sean.wang, deren.wu, ryder.lee From: Zac Bowling <zac@zacbowling.com> Check return values of mt7925_mcu_uni_rx_ba() and mt7925_mcu_uni_tx_ba() in mt7925_ampdu_action() and propagate errors to the caller. Previously, failures in these MCU commands were silently ignored, which could leave block aggregation in an inconsistent state between the driver and firmware. For IEEE80211_AMPDU_TX_STOP_CONT, only call the completion callback ieee80211_stop_tx_ba_cb_irqsafe() if the MCU command succeeded, to avoid signaling completion when the firmware operation failed. 
Reported-by: Zac Bowling <zac@zacbowling.com> Signed-off-by: Zac Bowling <zac@zacbowling.com> --- drivers/net/wireless/mediatek/mt76/mt7925/main.c | 13 +++++++------ 1 file changed, 7 insertions(+), 6 deletions(-) diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/main.c b/drivers/net/wireless/mediatek/mt76/mt7925/main.c index 7d3322461bcf..d966e5ab50ff 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7925/main.c +++ b/drivers/net/wireless/mediatek/mt76/mt7925/main.c @@ -1271,22 +1271,22 @@ mt7925_ampdu_action(struct ieee80211_hw *hw, struct ieee80211_vif *vif, case IEEE80211_AMPDU_RX_START: mt76_rx_aggr_start(&dev->mt76, &msta->deflink.wcid, tid, ssn, params->buf_size); - mt7925_mcu_uni_rx_ba(dev, params, true); + ret = mt7925_mcu_uni_rx_ba(dev, params, true); break; case IEEE80211_AMPDU_RX_STOP: mt76_rx_aggr_stop(&dev->mt76, &msta->deflink.wcid, tid); - mt7925_mcu_uni_rx_ba(dev, params, false); + ret = mt7925_mcu_uni_rx_ba(dev, params, false); break; case IEEE80211_AMPDU_TX_OPERATIONAL: mtxq->aggr = true; mtxq->send_bar = false; - mt7925_mcu_uni_tx_ba(dev, params, true); + ret = mt7925_mcu_uni_tx_ba(dev, params, true); break; case IEEE80211_AMPDU_TX_STOP_FLUSH: case IEEE80211_AMPDU_TX_STOP_FLUSH_CONT: mtxq->aggr = false; clear_bit(tid, &msta->deflink.wcid.ampdu_state); - mt7925_mcu_uni_tx_ba(dev, params, false); + ret = mt7925_mcu_uni_tx_ba(dev, params, false); break; case IEEE80211_AMPDU_TX_START: set_bit(tid, &msta->deflink.wcid.ampdu_state); @@ -1295,8 +1295,9 @@ mt7925_ampdu_action(struct ieee80211_hw *hw, struct ieee80211_vif *vif, case IEEE80211_AMPDU_TX_STOP_CONT: mtxq->aggr = false; clear_bit(tid, &msta->deflink.wcid.ampdu_state); - mt7925_mcu_uni_tx_ba(dev, params, false); - ieee80211_stop_tx_ba_cb_irqsafe(vif, sta->addr, tid); + ret = mt7925_mcu_uni_tx_ba(dev, params, false); + if (!ret) + ieee80211_stop_tx_ba_cb_irqsafe(vif, sta->addr, tid); break; } mt792x_mutex_release(dev); -- 2.51.0 ^ permalink raw reply related [flat|nested] 113+ messages in 
thread
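The TX_STOP_CONT change above boils down to "report completion only if the firmware command succeeded". Below is a small user-space sketch of that control flow. Everything in it (`struct fw_model`, `fw_tx_ba()`, `stop_tx_ba_done()`, `ampdu_tx_stop_cont()`) is a hypothetical model for illustration and not part of mt76 or mac80211.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical firmware model: commands fail while fw_ok is false. */
struct fw_model {
	bool fw_ok;
	int completions; /* how often the "stop done" callback ran */
};

/* Stands in for mt7925_mcu_uni_tx_ba(); returns 0 or a negative errno. */
static int fw_tx_ba(struct fw_model *fw, bool enable)
{
	(void)enable;
	return fw->fw_ok ? 0 : -EIO;
}

/* Stands in for ieee80211_stop_tx_ba_cb_irqsafe(). */
static void stop_tx_ba_done(struct fw_model *fw)
{
	fw->completions++;
}

/* Propagate the firmware result and only signal completion on success. */
static int ampdu_tx_stop_cont(struct fw_model *fw)
{
	int ret = fw_tx_ba(fw, false);

	if (!ret)
		stop_tx_ba_done(fw);

	return ret;
}
```

In this model a failing command returns `-EIO` to the caller and leaves `completions` untouched, so the upper layer is never told that a stop finished when it did not.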
* [PATCH] wifi: mt76: mt7925: add error handling for BSS info MCU command in sta_add 2026-01-01 6:25 ` [PATCH] wifi: mt76: mt7925: add error handling for AMPDU MCU commands Zac Bowling @ 2026-01-01 6:25 ` Zac Bowling 2026-01-01 6:25 ` [PATCH] wifi: mt76: mt7925: add error handling for BSS info in key setup Zac Bowling 1 sibling, 0 replies; 113+ messages in thread From: Zac Bowling @ 2026-01-01 6:25 UTC (permalink / raw) To: linux-wireless Cc: linux-mediatek, linux-kernel, kvalo, lorenzo, nbd, sean.wang, deren.wu, ryder.lee From: Zac Bowling <zac@zacbowling.com> Check return value of mt7925_mcu_add_bss_info() in mt7925_mac_link_sta_add() and propagate errors to the caller. BSS info must be set up before adding a station record. If this MCU command fails, continuing with station add would leave the firmware in an inconsistent state with a station but no BSS configuration. Reported-by: Zac Bowling <zac@zacbowling.com> Signed-off-by: Zac Bowling <zac@zacbowling.com> --- drivers/net/wireless/mediatek/mt76/mt7925/main.c | 11 +++++++---- 1 file changed, 7 insertions(+), 4 deletions(-) diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/main.c b/drivers/net/wireless/mediatek/mt76/mt7925/main.c index d966e5ab50ff..a7e1e673c4bc 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7925/main.c +++ b/drivers/net/wireless/mediatek/mt76/mt7925/main.c @@ -899,11 +899,14 @@ static int mt7925_mac_link_sta_add(struct mt76_dev *mdev, /* should update bss info before STA add */ if (vif->type == NL80211_IFTYPE_STATION && !link_sta->sta->tdls) { if (ieee80211_vif_is_mld(vif)) - mt7925_mcu_add_bss_info(&dev->phy, mconf->mt76.ctx, - link_conf, link_sta, link_sta != mlink->pri_link); + ret = mt7925_mcu_add_bss_info(&dev->phy, mconf->mt76.ctx, + link_conf, link_sta, + link_sta != mlink->pri_link); else - mt7925_mcu_add_bss_info(&dev->phy, mconf->mt76.ctx, - link_conf, link_sta, false); + ret = mt7925_mcu_add_bss_info(&dev->phy, mconf->mt76.ctx, + link_conf, link_sta, false); + if (ret) + 
return ret; } if (ieee80211_vif_is_mld(vif) && -- 2.51.0 ^ permalink raw reply related [flat|nested] 113+ messages in thread
* [PATCH] wifi: mt76: mt7925: add error handling for BSS info in key setup 2026-01-01 6:25 ` [PATCH] wifi: mt76: mt7925: add error handling for AMPDU MCU commands Zac Bowling 2026-01-01 6:25 ` [PATCH] wifi: mt76: mt7925: add error handling for BSS info MCU command in sta_add Zac Bowling @ 2026-01-01 6:25 ` Zac Bowling 1 sibling, 0 replies; 113+ messages in thread From: Zac Bowling @ 2026-01-01 6:25 UTC (permalink / raw) To: linux-wireless Cc: linux-mediatek, linux-kernel, kvalo, lorenzo, nbd, sean.wang, deren.wu, ryder.lee From: Zac Bowling <zac@zacbowling.com> Check return value of mt7925_mcu_add_bss_info() in mt7925_set_link_key() when setting up cipher for the first time and propagate errors. The BSS info update with cipher information must succeed before key programming can proceed. If this MCU command fails, continuing with key setup would program keys into the firmware for a BSS that doesn't have the correct cipher configuration. Reported-by: Zac Bowling <zac@zacbowling.com> Signed-off-by: Zac Bowling <zac@zacbowling.com> --- drivers/net/wireless/mediatek/mt76/mt7925/main.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/main.c b/drivers/net/wireless/mediatek/mt76/mt7925/main.c index a7e1e673c4bc..058394b2e067 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7925/main.c +++ b/drivers/net/wireless/mediatek/mt76/mt7925/main.c @@ -637,8 +637,10 @@ static int mt7925_set_link_key(struct ieee80211_hw *hw, enum set_key_cmd cmd, struct mt792x_phy *phy = mt792x_hw_phy(hw); mconf->mt76.cipher = mt7925_mcu_get_cipher(key->cipher); - mt7925_mcu_add_bss_info(phy, mconf->mt76.ctx, link_conf, - link_sta, true); + err = mt7925_mcu_add_bss_info(phy, mconf->mt76.ctx, link_conf, + link_sta, true); + if (err) + goto out; } if (cmd == SET_KEY) -- 2.51.0 ^ permalink raw reply related [flat|nested] 113+ messages in thread
* [PATCH] wifi: mt76: mt7921: fix missing mutex protection in multiple paths 2026-01-01 0:41 ` Zac Bowling 2026-01-01 6:25 ` [PATCH] wifi: mt76: mt7925: add NULL checks in MCU STA TLV functions Zac Bowling 2026-01-01 6:25 ` [PATCH] wifi: mt76: mt7925: add error handling for AMPDU MCU commands Zac Bowling @ 2026-01-01 6:25 ` Zac Bowling 2026-01-01 6:25 ` [PATCH] wifi: mt76: mt7925: add lockdep assertions for mutex verification Zac Bowling 3 siblings, 0 replies; 113+ messages in thread From: Zac Bowling @ 2026-01-01 6:25 UTC (permalink / raw) To: linux-wireless Cc: linux-mediatek, linux-kernel, kvalo, lorenzo, nbd, sean.wang, deren.wu, ryder.lee From: Zac Bowling <zac@zacbowling.com> The MT7921 driver has the same mutex protection bugs as MT7925 - they were inherited when MT7925 was forked from MT7921. Several code paths iterate over active interfaces and call MCU functions without proper mutex protection. Add mutex protection in the following locations: 1. mt7921_set_runtime_pm() in main.c: Called when runtime PM settings change. The callback mt7921_pm_interface_iter() calls MCU functions that require the device mutex to be held. 2. mt7921_regd_set_6ghz_power_type() in main.c: Called during VIF add/remove for 6GHz power type determination. Uses ieee80211_iterate_active_interfaces() without mutex. 3. mt7921_mac_reset_work() in mac.c: After firmware recovery, iterates interfaces to reconnect them. The mt7921_vif_connect_iter() callback calls MCU functions. 4. PCI/SDIO suspend paths (pci.c, sdio.c): The mt7921_roc_abort_sync() call iterates interfaces without mutex protection. These bugs can cause system hangs during: - Power management state transitions - WiFi reset/recovery - Suspend/resume cycles - 6GHz regulatory power type changes The fix follows the same pattern used in the MT7925 patches. 
Signed-off-by: Zac Bowling <zac@zacbowling.com> --- drivers/net/wireless/mediatek/mt76/mt7921/mac.c | 2 ++ drivers/net/wireless/mediatek/mt76/mt7921/main.c | 4 ++++ drivers/net/wireless/mediatek/mt76/mt7921/pci.c | 2 ++ drivers/net/wireless/mediatek/mt76/mt7921/sdio.c | 2 ++ 4 files changed, 10 insertions(+) diff --git a/drivers/net/wireless/mediatek/mt76/mt7921/mac.c b/drivers/net/wireless/mediatek/mt76/mt7921/mac.c index 03b4960db73f..f5c882e45bbe 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7921/mac.c +++ b/drivers/net/wireless/mediatek/mt76/mt7921/mac.c @@ -693,9 +693,11 @@ void mt7921_mac_reset_work(struct work_struct *work) clear_bit(MT76_RESET, &dev->mphy.state); pm->suspended = false; ieee80211_wake_queues(hw); + mt792x_mutex_acquire(dev); ieee80211_iterate_active_interfaces(hw, IEEE80211_IFACE_ITER_RESUME_ALL, mt7921_vif_connect_iter, NULL); + mt792x_mutex_release(dev); mt76_connac_power_save_sched(&dev->mt76.phy, pm); } diff --git a/drivers/net/wireless/mediatek/mt76/mt7921/main.c b/drivers/net/wireless/mediatek/mt76/mt7921/main.c index 5fae9a6e273c..05793a786644 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7921/main.c +++ b/drivers/net/wireless/mediatek/mt76/mt7921/main.c @@ -619,9 +619,11 @@ void mt7921_set_runtime_pm(struct mt792x_dev *dev) bool monitor = !!(hw->conf.flags & IEEE80211_CONF_MONITOR); pm->enable = pm->enable_user && !monitor; + mt792x_mutex_acquire(dev); ieee80211_iterate_active_interfaces(hw, IEEE80211_IFACE_ITER_RESUME_ALL, mt7921_pm_interface_iter, dev); + mt792x_mutex_release(dev); pm->ds_enable = pm->ds_enable_user && !monitor; mt76_connac_mcu_set_deep_sleep(&dev->mt76, pm->ds_enable); } @@ -765,9 +767,11 @@ mt7921_regd_set_6ghz_power_type(struct ieee80211_vif *vif, bool is_add) struct mt792x_dev *dev = phy->dev; u32 valid_vif_num = 0; + mt792x_mutex_acquire(dev); ieee80211_iterate_active_interfaces(mt76_hw(dev), IEEE80211_IFACE_ITER_RESUME_ALL, mt7921_calc_vif_num, &valid_vif_num); + mt792x_mutex_release(dev); if 
(valid_vif_num > 1) { phy->power_type = MT_AP_DEFAULT; diff --git a/drivers/net/wireless/mediatek/mt76/mt7921/pci.c b/drivers/net/wireless/mediatek/mt76/mt7921/pci.c index ec9686183251..9f76b334b93d 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7921/pci.c +++ b/drivers/net/wireless/mediatek/mt76/mt7921/pci.c @@ -426,7 +426,9 @@ static int mt7921_pci_suspend(struct device *device) cancel_delayed_work_sync(&pm->ps_work); cancel_work_sync(&pm->wake_work); + mt792x_mutex_acquire(dev); mt7921_roc_abort_sync(dev); + mt792x_mutex_release(dev); err = mt792x_mcu_drv_pmctrl(dev); if (err < 0) diff --git a/drivers/net/wireless/mediatek/mt76/mt7921/sdio.c b/drivers/net/wireless/mediatek/mt76/mt7921/sdio.c index 3421e53dc948..92ea2811816f 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7921/sdio.c +++ b/drivers/net/wireless/mediatek/mt76/mt7921/sdio.c @@ -219,7 +219,9 @@ static int mt7921s_suspend(struct device *__dev) cancel_delayed_work_sync(&pm->ps_work); cancel_work_sync(&pm->wake_work); + mt792x_mutex_acquire(dev); mt7921_roc_abort_sync(dev); + mt792x_mutex_release(dev); err = mt792x_mcu_drv_pmctrl(dev); if (err < 0) -- 2.51.0 ^ permalink raw reply related [flat|nested] 113+ messages in thread
* [PATCH] wifi: mt76: mt7925: add lockdep assertions for mutex verification 2026-01-01 0:41 ` Zac Bowling ` (2 preceding siblings ...) 2026-01-01 6:25 ` [PATCH] wifi: mt76: mt7921: fix missing mutex protection in multiple paths Zac Bowling @ 2026-01-01 6:25 ` Zac Bowling 2026-01-02 20:03 ` [PATCH v2 0/6] wifi: mt76: mt7925/mt792x: additional stability fixes Zac Bowling 3 siblings, 1 reply; 113+ messages in thread From: Zac Bowling @ 2026-01-01 6:25 UTC (permalink / raw) To: linux-wireless Cc: linux-mediatek, linux-kernel, kvalo, lorenzo, nbd, sean.wang, deren.wu, ryder.lee From: Zac Bowling <zac@zacbowling.com> Add lockdep_assert_held() calls to critical MCU functions to help catch mutex violations during development and debugging. This follows the pattern used in other mt76 drivers (mt7996, mt7915, mt7615). Functions with new assertions: - mt7925_mcu_add_bss_info(): Core BSS configuration MCU command - mt7925_mcu_sta_update(): Station record update MCU command - mt7925_mcu_uni_bss_ps(): Power save state MCU command These functions modify firmware state and must be called with the device mutex held to prevent race conditions. 
Signed-off-by: Zac Bowling <zac@zacbowling.com> --- drivers/net/wireless/mediatek/mt76/mt7925/mcu.c | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/mcu.c b/drivers/net/wireless/mediatek/mt76/mt7925/mcu.c index d61a7fbda745..958ff9da9f01 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7925/mcu.c +++ b/drivers/net/wireless/mediatek/mt76/mt7925/mcu.c @@ -1527,6 +1527,8 @@ int mt7925_mcu_uni_bss_ps(struct mt792x_dev *dev, }, }; + lockdep_assert_held(&dev->mt76.mutex); + if (link_conf->vif->type != NL80211_IFTYPE_STATION) return -EOPNOTSUPP; @@ -2037,6 +2039,8 @@ int mt7925_mcu_sta_update(struct mt792x_dev *dev, struct mt792x_sta *msta; struct mt792x_link_sta *mlink; + lockdep_assert_held(&dev->mt76.mutex); + if (link_sta) { msta = (struct mt792x_sta *)link_sta->sta->drv_priv; mlink = mt792x_sta_to_link(msta, link_sta->link_id); @@ -2843,6 +2847,8 @@ int mt7925_mcu_add_bss_info(struct mt792x_phy *phy, struct mt792x_link_sta *mlink_bc; struct sk_buff *skb; + lockdep_assert_held(&dev->mt76.mutex); + skb = __mt7925_mcu_alloc_bss_req(&dev->mt76, &mconf->mt76, MT7925_BSS_UPDATE_MAX_SIZE); if (IS_ERR(skb)) -- 2.51.0 ^ permalink raw reply related [flat|nested] 113+ messages in thread
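The contract these assertions document is "the caller takes the device mutex, the MCU helper checks that it is held". The sketch below is a single-threaded toy model of that contract. `struct toy_dev`, `toy_mutex_acquire()`, `toy_mutex_release()`, `toy_assert_held()` and `toy_mcu_cmd()` are all hypothetical. Note one deliberate difference: the real lockdep_assert_held() only emits a warning on kernels built with lockdep support and does not change control flow, whereas this toy returns an error so the violation is observable in a test.

```c
#include <errno.h>
#include <stdbool.h>

/* Toy device with a flag instead of a real mutex (single-threaded model). */
struct toy_dev {
	bool mutex_held;
};

static void toy_mutex_acquire(struct toy_dev *dev)
{
	dev->mutex_held = true;
}

static void toy_mutex_release(struct toy_dev *dev)
{
	dev->mutex_held = false;
}

/* Toy analogue of lockdep_assert_held(): reports instead of warning. */
static bool toy_assert_held(const struct toy_dev *dev)
{
	return dev->mutex_held;
}

/* Stands in for an MCU command that must run under the device mutex.
 * Returning -EPERM here is a modelling choice, not kernel behaviour. */
static int toy_mcu_cmd(struct toy_dev *dev)
{
	if (!toy_assert_held(dev))
		return -EPERM;

	return 0;
}
```

Calling `toy_mcu_cmd()` between `toy_mutex_acquire()` and `toy_mutex_release()` succeeds, while calling it outside that window is flagged, which is the class of bug the assertions are meant to expose during development.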
* [PATCH v2 0/6] wifi: mt76: mt7925/mt792x: additional stability fixes
  2026-01-01  6:25 ` [PATCH] wifi: mt76: mt7925: add lockdep assertions for mutex verification Zac Bowling
@ 2026-01-02 20:03 ` Zac Bowling
  2026-01-02 20:03 ` [PATCH] wifi: mt76: mt7925: fix key removal failure during MLO roaming Zac Bowling
  ` (6 more replies)
  0 siblings, 7 replies; 113+ messages in thread
From: Zac Bowling @ 2026-01-02 20:03 UTC (permalink / raw)
  To: zbowling
  Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless,
      lorenzo, nbd, ryder.lee, sean.wang

From: Zac Bowling <zac@zacbowling.com>

This series contains additional fixes for the MT7925 WiFi driver that
address issues discovered through further testing and static analysis.

These patches build on my previous series [1] and address the remaining
stability issues I've encountered on the Framework Desktop.

Changes since v1:
- 6 new patches addressing MLO roaming, firmware recovery, and resume path

Summary of fixes:

Patch 12: Fix key removal failure during MLO roaming
  During MLO link teardown, mac80211 may request key removal after
  driver state is already cleaned up. Return success instead of -EINVAL
  when the link is already gone, as the key is effectively removed.

Patch 13: Fix kernel warning in MLO ROC setup
  Replace WARN_ON_ONCE() with proper NULL checks in
  mt7925_mcu_set_mlo_roc(). During MLO AP setup, the channel may not be
  configured yet when this function is called. Return -ENOLINK instead
  of triggering a warning.

Patch 14: Add NULL checks for MLO link pointers in MCU functions
  Several MCU functions dereference mt792x_sta_to_link() and
  mt792x_vif_to_link() without checking for NULL. Add defensive checks
  in sta_hdr_trans_tlv, wtbl_update_hdr_trans, sta_amsdu_tlv,
  sta_mld_tlv, and sta_update.

Patch 15: Fix firmware reload after previous load crash (mt792x)
  Backport the MT7915 fix (commit 79dd14f) to MT792x. If firmware
  loading crashes after acquiring the patch semaphore, subsequent loads
  fail with "Failed to get patch semaphore". Release the semaphore and
  restart MCU before loading to ensure clean state.

Patch 16: Add mutex protection in resume path
  The resume path was missing mutex protection around
  mt7925_mcu_set_deep_sleep() and mt7925_regd_update() calls. Found by
  static analysis (sparse/coccinelle).

Patch 17: Add NULL checks and error handling
  Add NULL checks in mt7925_mac_link_sta_add() and mt7925_conf_tx().
  Add error logging for MCU calls in mt7925_regd_update() to help
  diagnose regulatory domain update failures.

These fixes have been tested on a Framework Desktop (AMD Ryzen AI Max
300) with the MT7925 (RZ717) WiFi card. The system is now stable through
suspend/resume cycles, MLO roaming, and firmware recovery scenarios that
previously caused crashes or hangs.

[1] https://lore.kernel.org/all/20260101062543.186499-1-zbowling@gmail.com/

Zac Bowling (6):
  wifi: mt76: mt7925: fix key removal failure during MLO roaming
  wifi: mt76: mt7925: fix kernel warning in MLO ROC setup
  wifi: mt76: mt7925: add NULL checks for MLO link pointers in MCU
  wifi: mt76: mt792x: fix firmware reload after failed load
  wifi: mt76: mt7925: add mutex protection in resume path
  wifi: mt76: mt7925: add NULL checks and error handling

 drivers/net/wireless/mediatek/mt76/mt7925/init.c | 13 ++-
 drivers/net/wireless/mediatek/mt76/mt7925/main.c | 19 +++-
 drivers/net/wireless/mediatek/mt76/mt7925/mcu.c  | 45 +++++---
 drivers/net/wireless/mediatek/mt76/mt7925/pci.c  |  2 +
 drivers/net/wireless/mediatek/mt76/mt792x_core.c | 14 +++
 5 files changed, 75 insertions(+), 18 deletions(-)

--
2.43.0

^ permalink raw reply [flat|nested] 113+ messages in thread
* [PATCH] wifi: mt76: mt7925: fix key removal failure during MLO roaming 2026-01-02 20:03 ` [PATCH v2 0/6] wifi: mt76: mt7925/mt792x: additional stability fixes Zac Bowling @ 2026-01-02 20:03 ` Zac Bowling 2026-01-02 20:03 ` [PATCH] wifi: mt76: mt7925: fix kernel warning in MLO ROC setup when channel not configured Zac Bowling ` (5 subsequent siblings) 6 siblings, 0 replies; 113+ messages in thread From: Zac Bowling @ 2026-01-02 20:03 UTC (permalink / raw) To: zbowling Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless, lorenzo, nbd, ryder.lee, sean.wang From: Zac Bowling <zac@zacbowling.com> During MLO roaming, mac80211 may request key removal after the link state has already been torn down. The current code returns -EINVAL when link_conf, mconf, or mlink is NULL, causing 'failed to remove key from hardware (-22)' errors in the kernel log. This is a race condition where: 1. MLO link teardown begins, cleaning up driver state 2. mac80211 requests group key removal for the old link 3. mt792x_vif_to_bss_conf() or related functions return NULL 4. Driver returns -EINVAL, confusing upper layers The fix: When removing a key (cmd != SET_KEY), if the link state is already gone, return success (0) instead of error. The key is effectively removed when the link was torn down. 
This prevents the following errors during roaming: wlp192s0: failed to remove key (1, ff:ff:ff:ff:ff:ff) from hardware (-22) wlp192s0: failed to remove key (4, ff:ff:ff:ff:ff:ff) from hardware (-22) And the associated wpa_supplicant warnings: nl80211: kernel reports: link ID must for MLO group key Reported-by: Zac Bowling <zac@zacbowling.com> Tested-by: Zac Bowling <zac@zacbowling.com> Signed-off-by: Zac Bowling <zac@zacbowling.com> --- drivers/net/wireless/mediatek/mt76/mt7925/main.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/main.c b/drivers/net/wireless/mediatek/mt76/mt7925/main.c index 13156333431d..11c0197c7426 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7925/main.c +++ b/drivers/net/wireless/mediatek/mt76/mt7925/main.c @@ -597,8 +597,15 @@ static int mt7925_set_link_key(struct ieee80211_hw *hw, enum set_key_cmd cmd, mconf = mt792x_vif_to_link(mvif, link_id); mlink = mt792x_sta_to_link(msta, link_id); - if (!link_conf || !mconf || !mlink) + if (!link_conf || !mconf || !mlink) { + /* During MLO roaming, link state may be torn down before + * mac80211 requests key removal. If removing a key and + * the link is already gone, consider it successfully removed. + */ + if (cmd != SET_KEY) + return 0; return -EINVAL; + } wcid = &mlink->wcid; wcid_keyidx = &wcid->hw_key_idx; -- 2.51.0 ^ permalink raw reply related [flat|nested] 113+ messages in thread
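The behavioural change in this patch is that removing a key from a link that no longer exists counts as success, while installing a key on such a link is still an error. A minimal user-space sketch of that rule follows. `enum key_cmd`, `struct link_state` and `set_link_key()` are hypothetical names for illustration. The real driver uses mac80211's `enum set_key_cmd` and checks three pointers rather than one.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical command set, loosely mirroring set/disable key requests. */
enum key_cmd {
	KEY_SET,
	KEY_DISABLE,
};

/* Hypothetical per-link state; NULL means the link was torn down. */
struct link_state {
	int keys_installed;
};

static int set_link_key(struct link_state *link, enum key_cmd cmd)
{
	if (!link) {
		/* Link already gone: its keys went with it, so a removal
		 * request has nothing left to do. Installing still fails. */
		return cmd == KEY_SET ? -EINVAL : 0;
	}

	if (cmd == KEY_SET)
		link->keys_installed++;
	else if (link->keys_installed > 0)
		link->keys_installed--;

	return 0;
}
```

Treating removal as idempotent keeps the caller's teardown path free of spurious errors without hiding a genuine failure to install a key.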
* [PATCH] wifi: mt76: mt7925: fix kernel warning in MLO ROC setup when channel not configured 2026-01-02 20:03 ` [PATCH v2 0/6] wifi: mt76: mt7925/mt792x: additional stability fixes Zac Bowling 2026-01-02 20:03 ` [PATCH] wifi: mt76: mt7925: fix key removal failure during MLO roaming Zac Bowling @ 2026-01-02 20:03 ` Zac Bowling 2026-01-02 20:03 ` [PATCH] wifi: mt76: mt7925: add NULL checks for MLO link pointers in MCU functions Zac Bowling ` (4 subsequent siblings) 6 siblings, 0 replies; 113+ messages in thread From: Zac Bowling @ 2026-01-02 20:03 UTC (permalink / raw) To: zbowling Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless, lorenzo, nbd, ryder.lee, sean.wang mt7925_mcu_set_mlo_roc() uses WARN_ON_ONCE() to check if link_conf or channel is NULL. However, during MLO AP setup, it's normal for the channel to not be configured yet when this function is called. The WARN_ON_ONCE triggers a kernel warning/oops that makes the system appear to have crashed, even though it's just a timing issue. Replace WARN_ON_ONCE with regular NULL checks and return -ENOLINK to indicate the link isn't fully configured yet. This allows the upper layers to retry when the link is ready, without spamming the kernel log with warnings. Also add a check for mconf in the first loop to match the pattern used in the second loop, preventing potential NULL dereference. This fixes kernel oops reported during MLO AP setup on OpenWrt with MT7925E hardware. Signed-off-by: Zac Bowling <zac@zacbowling.com> --- mt7925/mcu.c | 20 ++++++++++++++------ 1 file changed, 14 insertions(+), 6 deletions(-) diff --git a/mt7925/mcu.c b/mt7925/mcu.c index bd38807e..b0bbeb5a 100644 --- a/mt7925/mcu.c +++ b/mt7925/mcu.c @@ -1337,15 +1337,23 @@ int mt7925_mcu_set_mlo_roc(struct mt792x_bss_conf *mconf, u16 sel_links, for (i = 0; i < ARRAY_SIZE(links); i++) { links[i].id = i ? 
__ffs(~BIT(mconf->link_id) & sel_links) : mconf->link_id; + link_conf = mt792x_vif_to_bss_conf(vif, links[i].id); - if (WARN_ON_ONCE(!link_conf)) - return -EPERM; + if (!link_conf) + return -ENOLINK; links[i].chan = link_conf->chanreq.oper.chan; - if (WARN_ON_ONCE(!links[i].chan)) - return -EPERM; + if (!links[i].chan) + /* Channel not configured yet - this can happen during + * MLO AP setup when links are being added sequentially. + * Return -ENOLINK to indicate link not ready. + */ + return -ENOLINK; links[i].mconf = mt792x_vif_to_link(mvif, links[i].id); + if (!links[i].mconf) + return -ENOLINK; + links[i].tag = links[i].id == mconf->link_id ? UNI_ROC_ACQUIRE : UNI_ROC_SUB_LINK; @@ -1359,8 +1367,8 @@ int mt7925_mcu_set_mlo_roc(struct mt792x_bss_conf *mconf, u16 sel_links, type = MT7925_ROC_REQ_JOIN; for (i = 0; i < ARRAY_SIZE(links) && i < hweight16(vif->active_links); i++) { - if (WARN_ON_ONCE(!links[i].mconf || !links[i].chan)) - continue; + if (!links[i].mconf || !links[i].chan) + return -ENOLINK; chan = links[i].chan; center_ch = ieee80211_frequency_to_channel(chan->center_freq); -- 2.51.0 ^ permalink raw reply related [flat|nested] 113+ messages in thread
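The link selection in mt7925_mcu_set_mlo_roc() picks the primary link plus the lowest other bit set in `sel_links`, then bails out if either link has no channel yet. The sketch below models that in user space under stated assumptions: `struct chan`, `struct vif_model`, `lowest_set_bit()` and `pick_roc_links()` are hypothetical, and the extra `-EINVAL` guards (out-of-range primary, no secondary bit set) are additions of this sketch that the patch itself does not contain.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_LINKS 15

/* Hypothetical stand-in for struct ieee80211_channel. */
struct chan {
	int center_freq;
};

/* Hypothetical per-vif table: NULL means "channel not configured yet". */
struct vif_model {
	const struct chan *chan[MAX_LINKS];
};

/* Index of the lowest set bit. Like __ffs(), word must be non-zero. */
static unsigned int lowest_set_bit(uint16_t word)
{
	unsigned int i = 0;

	while (!(word & 1u)) {
		word >>= 1;
		i++;
	}

	return i;
}

/* Fill ids[0] with the primary link and ids[1] with the secondary link.
 * Returns -ENOLINK when a selected link is not ready yet. */
static int pick_roc_links(const struct vif_model *vif, uint16_t sel_links,
			  unsigned int primary, unsigned int ids[2])
{
	uint16_t others;
	unsigned int i;

	if (primary >= MAX_LINKS)
		return -EINVAL;

	/* Same idea as ~BIT(link_id) & sel_links in the driver. */
	others = sel_links & (uint16_t)~(1u << primary);
	if (!others)
		return -EINVAL; /* guard added here: no secondary link */

	ids[0] = primary;
	ids[1] = lowest_set_bit(others);

	for (i = 0; i < 2; i++) {
		if (ids[i] >= MAX_LINKS || !vif->chan[ids[i]])
			return -ENOLINK; /* link not fully configured */
	}

	return 0;
}
```

Returning a distinct error for "not ready yet" lets a caller tell a transient setup ordering issue apart from an invalid request.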
* [PATCH] wifi: mt76: mt7925: add NULL checks for MLO link pointers in MCU functions 2026-01-02 20:03 ` [PATCH v2 0/6] wifi: mt76: mt7925/mt792x: additional stability fixes Zac Bowling 2026-01-02 20:03 ` [PATCH] wifi: mt76: mt7925: fix key removal failure during MLO roaming Zac Bowling 2026-01-02 20:03 ` [PATCH] wifi: mt76: mt7925: fix kernel warning in MLO ROC setup when channel not configured Zac Bowling @ 2026-01-02 20:03 ` Zac Bowling 2026-01-02 20:03 ` [PATCH] wifi: mt76: mt792x: fix firmware reload failure after previous load crash Zac Bowling ` (3 subsequent siblings) 6 siblings, 0 replies; 113+ messages in thread From: Zac Bowling @ 2026-01-02 20:03 UTC (permalink / raw) To: zbowling Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless, lorenzo, nbd, ryder.lee, sean.wang Several MCU functions dereference pointers returned by mt792x_sta_to_link() and mt792x_vif_to_link() without checking for NULL. During MLO state transitions, these functions can return NULL when link state is being set up or torn down, causing kernel NULL pointer dereferences. Add NULL checks in the following functions: - mt7925_mcu_sta_hdr_trans_tlv(): Check mlink before dereferencing wcid - mt7925_mcu_wtbl_update_hdr_trans(): Check mlink and mconf before use - mt7925_mcu_sta_amsdu_tlv(): Check mlink before setting amsdu flag - mt7925_mcu_sta_mld_tlv(): Check mconf and mlink in link iteration loop - mt7925_mcu_sta_update(): Initialize mlink to NULL and check both link_sta and mlink in the ternary condition These race conditions can occur during: - MLO link setup/teardown - Station add/remove operations - Firmware command generation during state transitions The fixes follow the pattern used in mt7996 and ath12k drivers for similar MLO link state handling. 
Signed-off-by: Zac Bowling <zac@zacbowling.com> --- mt7925/mcu.c | 14 ++++++++++++-- 1 file changed, 12 insertions(+), 2 deletions(-) diff --git a/mt7925/mcu.c b/mt7925/mcu.c index bd38807e..b9c4b99d 100644 --- a/mt7925/mcu.c +++ b/mt7925/mcu.c @@ -1087,6 +1087,8 @@ mt7925_mcu_sta_hdr_trans_tlv(struct sk_buff *skb, struct mt792x_link_sta *mlink; mlink = mt792x_sta_to_link(msta, link_sta->link_id); + if (!mlink) + return; wcid = &mlink->wcid; } else { wcid = &mvif->sta.deflink.wcid; @@ -1120,6 +1122,9 @@ int mt7925_mcu_wtbl_update_hdr_trans(struct mt792x_dev *dev, link_sta = mt792x_sta_to_link_sta(vif, sta, link_id); mconf = mt792x_vif_to_link(mvif, link_id); + if (!mlink || !mconf) + return -EINVAL; + skb = __mt76_connac_mcu_alloc_sta_req(&dev->mt76, &mconf->mt76, &mlink->wcid, MT7925_STA_UPDATE_MAX_SIZE); @@ -1741,6 +1746,8 @@ mt7925_mcu_sta_amsdu_tlv(struct sk_buff *skb, amsdu->amsdu_en = true; mlink = mt792x_sta_to_link(msta, link_sta->link_id); + if (!mlink) + return; mlink->wcid.amsdu = true; switch (link_sta->agg.max_amsdu_len) { @@ -1935,6 +1942,9 @@ mt7925_mcu_sta_mld_tlv(struct sk_buff *skb, mconf = mt792x_vif_to_link(mvif, i); mlink = mt792x_sta_to_link(msta, i); + if (!mconf || !mlink) + continue; + mld->link[cnt].wlan_id = cpu_to_le16(mlink->wcid.idx); mld->link[cnt++].bss_idx = mconf->mt76.idx; @@ -2027,13 +2037,13 @@ int mt7925_mcu_sta_update(struct mt792x_dev *dev, .rcpi = to_rcpi(rssi), }; struct mt792x_sta *msta; - struct mt792x_link_sta *mlink; + struct mt792x_link_sta *mlink = NULL; if (link_sta) { msta = (struct mt792x_sta *)link_sta->sta->drv_priv; mlink = mt792x_sta_to_link(msta, link_sta->link_id); } - info.wcid = link_sta ? &mlink->wcid : &mvif->sta.deflink.wcid; + info.wcid = (link_sta && mlink) ? &mlink->wcid : &mvif->sta.deflink.wcid; info.newly = state != MT76_STA_INFO_STATE_ASSOC; return mt7925_mcu_sta_cmd(&dev->mphy, &info); -- 2.51.0 ^ permalink raw reply related [flat|nested] 113+ messages in thread
* [PATCH] wifi: mt76: mt792x: fix firmware reload failure after previous load crash 2026-01-02 20:03 ` [PATCH v2 0/6] wifi: mt76: mt7925/mt792x: additional stability fixes Zac Bowling ` (2 preceding siblings ...) 2026-01-02 20:03 ` [PATCH] wifi: mt76: mt7925: add NULL checks for MLO link pointers in MCU functions Zac Bowling @ 2026-01-02 20:03 ` Zac Bowling 2026-01-03 6:46 ` Sean Wang 2026-01-02 20:03 ` [PATCH] wifi: mt76: mt7925: add mutex protection in resume path Zac Bowling ` (2 subsequent siblings) 6 siblings, 1 reply; 113+ messages in thread From: Zac Bowling @ 2026-01-02 20:03 UTC (permalink / raw) To: zbowling Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless, lorenzo, nbd, ryder.lee, sean.wang If the firmware loading process crashes or is interrupted after acquiring the patch semaphore but before releasing it, subsequent firmware load attempts will fail with 'Failed to get patch semaphore' because the semaphore is still held. This issue manifests as devices becoming unusable after suspend/resume failures or firmware crashes, requiring a full hardware reboot to recover. This has been widely reported on MT7921 and MT7925 devices. Apply the same fix that was applied to MT7915 in commit 79dd14f: 1. Release the patch semaphore before starting firmware load (in case it was held by a previous failed attempt) 2. Restart MCU firmware to ensure clean state 3. Wait briefly for MCU to be ready This fix applies to both MT7921 and MT7925 drivers which share the mt792x_load_firmware() function. Fixes: 'Failed to get patch semaphore' errors after firmware crash Signed-off-by: Zac Bowling <zac@zacbowling.com> --- mt792x_core.c | 14 ++++++++++++++ 1 file changed, 14 insertions(+) diff --git a/mt792x_core.c b/mt792x_core.c index cc488ee9..b82e4470 100644 --- a/mt792x_core.c +++ b/mt792x_core.c @@ -927,6 +927,20 @@ int mt792x_load_firmware(struct mt792x_dev *dev) { int ret; + /* Release semaphore if taken by previous failed load attempt. 
+ * This prevents "Failed to get patch semaphore" errors when + * recovering from firmware crashes or suspend/resume failures. + */ + ret = mt76_connac_mcu_patch_sem_ctrl(&dev->mt76, false); + if (ret < 0) + dev_dbg(dev->mt76.dev, "Semaphore release returned %d (may be expected)\n", ret); + + /* Always restart MCU to ensure clean state before loading firmware */ + mt76_connac_mcu_restart(&dev->mt76); + + /* Wait for MCU to be ready after restart */ + msleep(100); + ret = mt76_connac2_load_patch(&dev->mt76, mt792x_patch_name(dev)); if (ret) return ret; -- 2.51.0 ^ permalink raw reply related [flat|nested] 113+ messages in thread
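Why an unconditional release helps can be seen in a toy model of the patch semaphore. The sketch below is a user-space illustration only: `struct mcu_model`, `sem_get()`, `sem_release()`, `mcu_restart()`, `load_patch()` and `load_firmware()` are hypothetical, the error code is arbitrary, and the 100 ms wait from the patch is left out.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical MCU model: one patch semaphore and a "running" flag. */
struct mcu_model {
	bool sem_held;
	bool running;
};

static int sem_get(struct mcu_model *mcu)
{
	if (mcu->sem_held)
		return -EAGAIN; /* models "Failed to get patch semaphore" */

	mcu->sem_held = true;
	return 0;
}

static void sem_release(struct mcu_model *mcu)
{
	mcu->sem_held = false; /* harmless when it was not held */
}

static void mcu_restart(struct mcu_model *mcu)
{
	mcu->running = false;
}

/* Normal load path: take the semaphore, load, release it again. */
static int load_patch(struct mcu_model *mcu)
{
	int ret = sem_get(mcu);

	if (ret)
		return ret;

	mcu->running = true;
	sem_release(mcu);
	return 0;
}

/* Recovery-safe load: clear any stale semaphore and restart first. */
static int load_firmware(struct mcu_model *mcu)
{
	sem_release(mcu);
	mcu_restart(mcu);
	return load_patch(mcu);
}
```

Starting from a state where a previous load died while holding the semaphore, `load_patch()` alone keeps failing, whereas `load_firmware()` recovers because it does not trust the state left behind by the earlier attempt.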
* Re: [PATCH] wifi: mt76: mt792x: fix firmware reload failure after previous load crash 2026-01-02 20:03 ` [PATCH] wifi: mt76: mt792x: fix firmware reload failure after previous load crash Zac Bowling @ 2026-01-03 6:46 ` Sean Wang 2026-01-03 18:42 ` Zac Bowling 0 siblings, 1 reply; 113+ messages in thread From: Sean Wang @ 2026-01-03 6:46 UTC (permalink / raw) To: Zac Bowling Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless, lorenzo, nbd, ryder.lee, sean.wang On Fri, Jan 2, 2026 at 2:03 PM Zac Bowling <zbowling@gmail.com> wrote: > > If the firmware loading process crashes or is interrupted after > acquiring the patch semaphore but before releasing it, subsequent > firmware load attempts will fail with 'Failed to get patch semaphore' > because the semaphore is still held. > > This issue manifests as devices becoming unusable after suspend/resume > failures or firmware crashes, requiring a full hardware reboot to > recover. This has been widely reported on MT7921 and MT7925 devices. > > Apply the same fix that was applied to MT7915 in commit 79dd14f: > 1. Release the patch semaphore before starting firmware load (in case > it was held by a previous failed attempt) > 2. Restart MCU firmware to ensure clean state > 3. Wait briefly for MCU to be ready > > This fix applies to both MT7921 and MT7925 drivers which share the > mt792x_load_firmware() function. > > Fixes: 'Failed to get patch semaphore' errors after firmware crash > Signed-off-by: Zac Bowling <zac@zacbowling.com> > --- > mt792x_core.c | 14 ++++++++++++++ > 1 file changed, 14 insertions(+) > > diff --git a/mt792x_core.c b/mt792x_core.c > index cc488ee9..b82e4470 100644 > --- a/mt792x_core.c > +++ b/mt792x_core.c > @@ -927,6 +927,20 @@ int mt792x_load_firmware(struct mt792x_dev *dev) > { > int ret; > > + /* Release semaphore if taken by previous failed load attempt. > + * This prevents "Failed to get patch semaphore" errors when > + * recovering from firmware crashes or suspend/resume failures. 
> + */ > + ret = mt76_connac_mcu_patch_sem_ctrl(&dev->mt76, false); > + if (ret < 0) > + dev_dbg(dev->mt76.dev, "Semaphore release returned %d (may be expected)\n", ret); > + > + /* Always restart MCU to ensure clean state before loading firmware */ > + mt76_connac_mcu_restart(&dev->mt76); > + > + /* Wait for MCU to be ready after restart */ > + msleep(100); > + Hi Zac, This is a good finding. Since this is a common mt792x code path, have you also had a chance to test it on MT7921? One small nit: the Fixes tag should reference the actual commit being fixed, e.g. Fixes: <commit-sha> ("mt76: mt792x: ...") instead of the error string. Sean > ret = mt76_connac2_load_patch(&dev->mt76, mt792x_patch_name(dev)); > if (ret) > return ret; > -- > 2.51.0 > > > ^ permalink raw reply [flat|nested] 113+ messages in thread
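The release-before-acquire idea in the patch above can be sketched in plain userspace C. This is only a model under stated assumptions: `struct fake_mcu`, `fake_sem_get()`, `fake_sem_release()` and `fake_load_firmware()` are hypothetical names invented for illustration. In the real driver the semaphore lives in the firmware and is driven through MCU commands, and the sketch leaves out the MCU restart and the delay.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the MCU-side patch semaphore state. */
struct fake_mcu {
	bool sem_held;
};

/* Take the semaphore; fails if a previous owner never released it. */
static int fake_sem_get(struct fake_mcu *mcu)
{
	if (mcu->sem_held)
		return -1;
	mcu->sem_held = true;
	return 0;
}

/* Release is idempotent in this model: releasing a free semaphore is harmless. */
static void fake_sem_release(struct fake_mcu *mcu)
{
	mcu->sem_held = false;
}

/*
 * Load path. With recover_stale set, any semaphore left over from an
 * interrupted earlier attempt is dropped before the normal sequence runs.
 */
static int fake_load_firmware(struct fake_mcu *mcu, bool recover_stale)
{
	if (recover_stale)
		fake_sem_release(mcu);

	if (fake_sem_get(mcu) < 0) {
		fprintf(stderr, "Failed to get patch semaphore\n");
		return -1;
	}

	/* ... the patch download would happen here ... */

	fake_sem_release(mcu);
	return 0;
}
```

Starting from a state where `sem_held` is already true (an interrupted earlier load), the call without recovery fails and the call with recovery succeeds, which is the behaviour the commit message describes.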
* Re: [PATCH] wifi: mt76: mt792x: fix firmware reload failure after previous load crash 2026-01-03 6:46 ` Sean Wang @ 2026-01-03 18:42 ` Zac Bowling 2026-01-15 7:19 ` Zac Bowling 0 siblings, 1 reply; 113+ messages in thread From: Zac Bowling @ 2026-01-03 18:42 UTC (permalink / raw) To: Sean Wang Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless, lorenzo, nbd, ryder.lee, sean.wang Hi Sean, Thanks! Unfortunately not: I don't have an MT7921, only an MT7925. I ordered one from Amazon and it should be here in a week or two. Zac Bowling On Fri, Jan 2, 2026 at 10:46 PM Sean Wang <sean.wang@kernel.org> wrote: > > On Fri, Jan 2, 2026 at 2:03 PM Zac Bowling <zbowling@gmail.com> wrote: > > > > If the firmware loading process crashes or is interrupted after > > acquiring the patch semaphore but before releasing it, subsequent > > firmware load attempts will fail with 'Failed to get patch semaphore' > > because the semaphore is still held. > > > > This issue manifests as devices becoming unusable after suspend/resume > > failures or firmware crashes, requiring a full hardware reboot to > > recover. This has been widely reported on MT7921 and MT7925 devices. > > > > Apply the same fix that was applied to MT7915 in commit 79dd14f: > > 1. Release the patch semaphore before starting firmware load (in case > > it was held by a previous failed attempt) > > 2. Restart MCU firmware to ensure clean state > > 3. Wait briefly for MCU to be ready > > > > This fix applies to both MT7921 and MT7925 drivers which share the > > mt792x_load_firmware() function. 
> > > > Fixes: 'Failed to get patch semaphore' errors after firmware crash > > Signed-off-by: Zac Bowling <zac@zacbowling.com> > > --- > > mt792x_core.c | 14 ++++++++++++++ > > 1 file changed, 14 insertions(+) > > > > diff --git a/mt792x_core.c b/mt792x_core.c > > index cc488ee9..b82e4470 100644 > > --- a/mt792x_core.c > > +++ b/mt792x_core.c > > @@ -927,6 +927,20 @@ int mt792x_load_firmware(struct mt792x_dev *dev) > > { > > int ret; > > > > + /* Release semaphore if taken by previous failed load attempt. > > + * This prevents "Failed to get patch semaphore" errors when > > + * recovering from firmware crashes or suspend/resume failures. > > + */ > > + ret = mt76_connac_mcu_patch_sem_ctrl(&dev->mt76, false); > > + if (ret < 0) > > + dev_dbg(dev->mt76.dev, "Semaphore release returned %d (may be expected)\n", ret); > > + > > + /* Always restart MCU to ensure clean state before loading firmware */ > > + mt76_connac_mcu_restart(&dev->mt76); > > + > > + /* Wait for MCU to be ready after restart */ > > + msleep(100); > > + > > Hi Zac, > > This is a good finding. Since this is a common mt792x code path, have you > also had a chance to test it on MT7921? > > One small nit: the Fixes tag should reference the actual commit being > fixed, e.g. > > Fixes: <commit-sha> ("mt76: mt792x: ...") > > instead of the error string. > > Sean > > > ret = mt76_connac2_load_patch(&dev->mt76, mt792x_patch_name(dev)); > > if (ret) > > return ret; > > -- > > 2.51.0 > > > > > > ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: [PATCH] wifi: mt76: mt792x: fix firmware reload failure after previous load crash 2026-01-03 18:42 ` Zac Bowling @ 2026-01-15 7:19 ` Zac Bowling 0 siblings, 0 replies; 113+ messages in thread From: Zac Bowling @ 2026-01-15 7:19 UTC (permalink / raw) To: Sean Wang, linux Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless, lorenzo, nbd, ryder.lee, sean.wang While I'm still waiting for feedback on these patches, I've set up a public repository with all the fixes and created a DKMS package so that others hitting these issues can load the patched code as an alternative driver, since so many people are running into the same problems on several popular commercial laptops and desktops: https://github.com/zbowling/mt7925 The repository has: - All 18 patches from this series as sent here (in versions that apply cleanly to different kernel versions) - Pre-patched kernel branches (6.17.x, 6.18.x, 6.19-rc5) in another repo linked in the README - A new DKMS package for out-of-tree builds (requires kernel 6.17+), with kernel-version #ifdefs so that a single package works on all recent kernels. The DKMS package builds the mt76, mt76-connac-lib, mt792x-lib, mt7925-common, and mt7925e modules with all fixes applied. From community testing with people who hit these same panics on the current upstream driver, I've heard from many that this patch series (whether applied as patches or installed via the DKMS build) fixes most of their issues. There still seem to be ongoing issues inside the firmware related to MLO and deauths with certain APs (especially with my Unifi U7 Pros), but at least this keeps machines from crashing while the chip resets, so you only suffer momentary losses in connectivity instead of a straight-up kernel panic or a deadlock. 
For anyone still hitting the NULL pointer dereferences, mutex deadlocks with NetworkManager and friends during MLO and deauth situations, or suspend/resume hangs with mt7925, this DKMS package or these patches should help greatly. Happy to address any review feedback whenever you finally have a chance to look at these. Zac Bowling On Sat, Jan 3, 2026 at 10:42 AM Zac Bowling <zbowling@gmail.com> wrote: > > Hi Sean, > > Thanks! I don't have a MT7921, only a MT7925, so no unfortunately. I > ordered off Amazon and should be here in a week or two. > > Zac Bowling > > Zac Bowling > > > On Fri, Jan 2, 2026 at 10:46 PM Sean Wang <sean.wang@kernel.org> wrote: > > > > On Fri, Jan 2, 2026 at 2:03 PM Zac Bowling <zbowling@gmail.com> wrote: > > > > > > If the firmware loading process crashes or is interrupted after > > > acquiring the patch semaphore but before releasing it, subsequent > > > firmware load attempts will fail with 'Failed to get patch semaphore' > > > because the semaphore is still held. > > > > > > This issue manifests as devices becoming unusable after suspend/resume > > > failures or firmware crashes, requiring a full hardware reboot to > > > recover. This has been widely reported on MT7921 and MT7925 devices. > > > > > > Apply the same fix that was applied to MT7915 in commit 79dd14f: > > > 1. Release the patch semaphore before starting firmware load (in case > > > it was held by a previous failed attempt) > > > 2. Restart MCU firmware to ensure clean state > > > 3. Wait briefly for MCU to be ready > > > > > > This fix applies to both MT7921 and MT7925 drivers which share the > > > mt792x_load_firmware() function. 
> > > > > > Fixes: 'Failed to get patch semaphore' errors after firmware crash > > > Signed-off-by: Zac Bowling <zac@zacbowling.com> > > > --- > > > mt792x_core.c | 14 ++++++++++++++ > > > 1 file changed, 14 insertions(+) > > > > > > diff --git a/mt792x_core.c b/mt792x_core.c > > > index cc488ee9..b82e4470 100644 > > > --- a/mt792x_core.c > > > +++ b/mt792x_core.c > > > @@ -927,6 +927,20 @@ int mt792x_load_firmware(struct mt792x_dev *dev) > > > { > > > int ret; > > > > > > + /* Release semaphore if taken by previous failed load attempt. > > > + * This prevents "Failed to get patch semaphore" errors when > > > + * recovering from firmware crashes or suspend/resume failures. > > > + */ > > > + ret = mt76_connac_mcu_patch_sem_ctrl(&dev->mt76, false); > > > + if (ret < 0) > > > + dev_dbg(dev->mt76.dev, "Semaphore release returned %d (may be expected)\n", ret); > > > + > > > + /* Always restart MCU to ensure clean state before loading firmware */ > > > + mt76_connac_mcu_restart(&dev->mt76); > > > + > > > + /* Wait for MCU to be ready after restart */ > > > + msleep(100); > > > + > > > > Hi Zac, > > > > This is a good finding. Since this is a common mt792x code path, have you > > also had a chance to test it on MT7921? > > > > One small nit: the Fixes tag should reference the actual commit being > > fixed, e.g. > > > > Fixes: <commit-sha> ("mt76: mt792x: ...") > > > > instead of the error string. > > > > Sean > > > > > ret = mt76_connac2_load_patch(&dev->mt76, mt792x_patch_name(dev)); > > > if (ret) > > > return ret; > > > -- > > > 2.51.0 > > > > > > > > > ^ permalink raw reply [flat|nested] 113+ messages in thread
* [PATCH] wifi: mt76: mt7925: add mutex protection in resume path 2026-01-02 20:03 ` [PATCH v2 0/6] wifi: mt76: mt7925/mt792x: additional stability fixes Zac Bowling ` (3 preceding siblings ...) 2026-01-02 20:03 ` [PATCH] wifi: mt76: mt792x: fix firmware reload failure after previous load crash Zac Bowling @ 2026-01-02 20:03 ` Zac Bowling 2026-01-02 20:03 ` [PATCH] wifi: mt76: mt7925: add NULL checks and error handling for MCU calls Zac Bowling 2026-01-02 20:05 ` [PATCH] wifi: mt76: mt7925: comprehensive stability fixes Zac Bowling 6 siblings, 0 replies; 113+ messages in thread From: Zac Bowling @ 2026-01-02 20:03 UTC (permalink / raw) To: zbowling Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless, lorenzo, nbd, ryder.lee, sean.wang From: Zac Bowling <zac@zacbowling.com> Add mutex protection around mt7925_mcu_set_deep_sleep() and mt7925_regd_update() calls in the resume path to prevent potential race conditions during resume operations. These MCU operations require serialization, and the resume path was the only call site missing mutex protection. Found by static analysis (sparse/coccinelle). --- drivers/net/wireless/mediatek/mt76/mt7925/pci.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/pci.c b/drivers/net/wireless/mediatek/mt76/mt7925/pci.c index ca868619e1b7..b6c90c5f7e91 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7925/pci.c +++ b/drivers/net/wireless/mediatek/mt76/mt7925/pci.c @@ -583,10 +583,12 @@ static int _mt7925_pci_resume(struct device *device, bool restore) } /* restore previous ds setting */ + mt792x_mutex_acquire(dev); if (!pm->ds_enable) mt7925_mcu_set_deep_sleep(dev, false); mt7925_regd_update(dev); + mt792x_mutex_release(dev); failed: pm->suspended = false; -- 2.51.0 ^ permalink raw reply related [flat|nested] 113+ messages in thread
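As a rough illustration of what the resume-path change above is after, here is a single-threaded userspace model. Every name in it (`struct fake_dev`, `fake_mcu_cmd()`, `fake_resume_tail()`) is hypothetical. It does not reproduce a race; it only records whether each simulated MCU command was issued while the device lock was held, which is the property the patch restores.

```c
#include <stdbool.h>

/* Hypothetical device state; lock_held stands in for the device mutex. */
struct fake_dev {
	bool lock_held;
	bool ds_enable;      /* desired deep-sleep setting */
	bool deep_sleep;     /* setting currently programmed */
	bool regd_change;    /* regulatory update pending */
	int unlocked_calls;  /* simulated MCU commands issued without the lock */
};

/* Every simulated MCU command notes whether the caller held the lock. */
static void fake_mcu_cmd(struct fake_dev *dev)
{
	if (!dev->lock_held)
		dev->unlocked_calls++;
}

static void fake_set_deep_sleep(struct fake_dev *dev, bool enable)
{
	fake_mcu_cmd(dev);
	dev->deep_sleep = enable;
}

static void fake_regd_update(struct fake_dev *dev)
{
	if (!dev->regd_change)
		return;
	fake_mcu_cmd(dev);
	dev->regd_change = false;
}

/* Tail of a resume handler; take_lock selects the patched behaviour. */
static void fake_resume_tail(struct fake_dev *dev, bool take_lock)
{
	if (take_lock)
		dev->lock_held = true;

	if (!dev->ds_enable)
		fake_set_deep_sleep(dev, false);
	fake_regd_update(dev);

	if (take_lock)
		dev->lock_held = false;
}
```

With `take_lock` false, both simulated commands are counted as unlocked; with it true, none are. In kernel code the equivalent check is usually expressed with a lockdep assertion in the callee rather than a counter.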
* [PATCH] wifi: mt76: mt7925: add NULL checks and error handling for MCU calls 2026-01-02 20:03 ` [PATCH v2 0/6] wifi: mt76: mt7925/mt792x: additional stability fixes Zac Bowling ` (4 preceding siblings ...) 2026-01-02 20:03 ` [PATCH] wifi: mt76: mt7925: add mutex protection in resume path Zac Bowling @ 2026-01-02 20:03 ` Zac Bowling 2026-01-02 20:05 ` [PATCH] wifi: mt76: mt7925: comprehensive stability fixes Zac Bowling 6 siblings, 0 replies; 113+ messages in thread From: Zac Bowling @ 2026-01-02 20:03 UTC (permalink / raw) To: zbowling Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless, lorenzo, nbd, ryder.lee, sean.wang From: Zac Bowling <zac@zacbowling.com> Add NULL pointer checks for mt792x_sta_to_link() and mt792x_vif_to_link() results in critical paths to prevent kernel crashes during MLO operations. Add error logging for MCU return values in mt7925_regd_update() to help diagnose regulatory domain update failures. Found by static analysis review. --- drivers/net/wireless/mediatek/mt76/mt7925/init.c | 13 ++++++++++--- drivers/net/wireless/mediatek/mt76/mt7925/main.c | 8 ++++++++ 2 files changed, 18 insertions(+), 3 deletions(-) diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/init.c b/drivers/net/wireless/mediatek/mt76/mt7925/init.c index d7d5afe365ed..f800112ccaf7 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7925/init.c +++ b/drivers/net/wireless/mediatek/mt76/mt7925/init.c @@ -162,10 +162,17 @@ void mt7925_regd_update(struct mt792x_dev *dev) if (!dev->regd_change) return; - mt7925_mcu_set_clc(dev, mdev->alpha2, dev->country_ie_env); + if (mt7925_mcu_set_clc(dev, mdev->alpha2, dev->country_ie_env) < 0) + dev_warn(dev->mt76.dev, "Failed to set CLC\n"); + mt7925_regd_channel_update(wiphy, dev); - mt7925_mcu_set_channel_domain(hw->priv); - mt7925_set_tx_sar_pwr(hw, NULL); + + if (mt7925_mcu_set_channel_domain(hw->priv) < 0) + dev_warn(dev->mt76.dev, "Failed to set channel domain\n"); + + if (mt7925_set_tx_sar_pwr(hw, NULL) < 0) + 
dev_warn(dev->mt76.dev, "Failed to set TX SAR power\n"); + dev->regd_change = false; } EXPORT_SYMBOL_GPL(mt7925_regd_update); diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/main.c b/drivers/net/wireless/mediatek/mt76/mt7925/main.c index 11c0197c7426..b6e3002faf41 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7925/main.c +++ b/drivers/net/wireless/mediatek/mt76/mt7925/main.c @@ -863,12 +863,17 @@ static int mt7925_mac_link_sta_add(struct mt76_dev *mdev, msta = (struct mt792x_sta *)link_sta->sta->drv_priv; mlink = mt792x_sta_to_link(msta, link_id); + if (!mlink) + return -EINVAL; idx = mt76_wcid_alloc(dev->mt76.wcid_mask, MT792x_WTBL_STA - 1); if (idx < 0) return -ENOSPC; mconf = mt792x_vif_to_link(mvif, link_id); + if (!mconf) + return -EINVAL; + mt76_wcid_init(&mlink->wcid, 0); mlink->wcid.sta = 1; mlink->wcid.idx = idx; @@ -1750,6 +1755,9 @@ mt7925_conf_tx(struct ieee80211_hw *hw, struct ieee80211_vif *vif, [IEEE80211_AC_BK] = 1, }; + if (!mconf) + return -EINVAL; + /* firmware uses access class index */ mconf->queue_params[mq_to_aci[queue]] = *params; -- 2.51.0 ^ permalink raw reply related [flat|nested] 113+ messages in thread
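The lookup checks in the patch above follow a common shape: a per-link lookup may return NULL, so the result is validated before it is used. A minimal userspace sketch, assuming hypothetical types and helpers (`struct fake_sta`, `fake_sta_to_link()`, `fake_idx_alloc()`, `fake_link_sta_add()`) that are not the driver's own:

```c
#include <errno.h>
#include <stddef.h>

#define FAKE_MAX_LINKS 4
#define FAKE_MAX_IDX   8

/* Hypothetical per-link state. */
struct fake_link {
	int idx;
};

/* Hypothetical station with optional per-link entries. */
struct fake_sta {
	struct fake_link *link[FAKE_MAX_LINKS];
};

/* Lookup that can legitimately return NULL for an unpopulated link. */
static struct fake_link *fake_sta_to_link(struct fake_sta *sta,
					  unsigned int link_id)
{
	if (link_id >= FAKE_MAX_LINKS)
		return NULL;
	return sta->link[link_id];
}

/* Reserve the lowest free index in a bitmask; -1 when none is left. */
static int fake_idx_alloc(unsigned int *mask)
{
	for (int i = 0; i < FAKE_MAX_IDX; i++) {
		if (!(*mask & (1u << i))) {
			*mask |= 1u << i;
			return i;
		}
	}
	return -1;
}

/* Validate the lookup first, then reserve an index. */
static int fake_link_sta_add(struct fake_sta *sta, unsigned int link_id,
			     unsigned int *mask)
{
	struct fake_link *link = fake_sta_to_link(sta, link_id);
	int idx;

	if (!link)
		return -EINVAL;

	idx = fake_idx_alloc(mask);
	if (idx < 0)
		return -ENOSPC;

	link->idx = idx;
	return 0;
}
```

In this sketch the lookup is checked before the index is reserved, so the early-return path has nothing to unwind and the mask is left unchanged on failure.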
* [PATCH] wifi: mt76: mt7925: comprehensive stability fixes 2026-01-02 20:03 ` [PATCH v2 0/6] wifi: mt76: mt7925/mt792x: additional stability fixes Zac Bowling ` (5 preceding siblings ...) 2026-01-02 20:03 ` [PATCH] wifi: mt76: mt7925: add NULL checks and error handling for MCU calls Zac Bowling @ 2026-01-02 20:05 ` Zac Bowling 2026-01-03 6:25 ` Sean Wang 2026-01-05 0:26 ` [PATCH v3 00/17] wifi: mt76: mt7925/mt792x: " Zac Bowling 6 siblings, 2 replies; 113+ messages in thread From: Zac Bowling @ 2026-01-02 20:05 UTC (permalink / raw) To: zbowling Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless, lorenzo, nbd, ryder.lee, sean.wang From: Zac Bowling <zac@zacbowling.com> This unified patch combines all MT7925 driver fixes for kernel stability: 1. NULL pointer dereference fixes in vif iteration, TX path, and MCU functions 2. Missing mutex protection in reset, ROC, PM, and resume paths 3. Error handling for MCU commands (AMPDU, BSS info, key setup) 4. lockdep assertions for debugging 5. MLO (Multi-Link Operation) improvements for roaming and AP mode 6. Firmware reload recovery after crashes These fixes address kernel panics and system hangs that occur during: - WiFi network switching and BSSID roaming - Suspend/resume cycles - MLO link state transitions - Firmware recovery after crashes Tested on Framework Desktop (AMD Ryzen AI Max 300) with MT7925 (RZ717). 
Individual patches and detailed analysis available at: https://github.com/zbowling/mt7925 Signed-off-by: Zac Bowling <zac@zacbowling.com> --- diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/init.c b/drivers/net/wireless/mediatek/mt76/mt7925/init.c index d7d5afe365ed..f800112ccaf7 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7925/init.c +++ b/drivers/net/wireless/mediatek/mt76/mt7925/init.c @@ -162,10 +162,17 @@ void mt7925_regd_update(struct mt792x_dev *dev) if (!dev->regd_change) return; - mt7925_mcu_set_clc(dev, mdev->alpha2, dev->country_ie_env); + if (mt7925_mcu_set_clc(dev, mdev->alpha2, dev->country_ie_env) < 0) + dev_warn(dev->mt76.dev, "Failed to set CLC\n"); + mt7925_regd_channel_update(wiphy, dev); - mt7925_mcu_set_channel_domain(hw->priv); - mt7925_set_tx_sar_pwr(hw, NULL); + + if (mt7925_mcu_set_channel_domain(hw->priv) < 0) + dev_warn(dev->mt76.dev, "Failed to set channel domain\n"); + + if (mt7925_set_tx_sar_pwr(hw, NULL) < 0) + dev_warn(dev->mt76.dev, "Failed to set TX SAR power\n"); + dev->regd_change = false; } EXPORT_SYMBOL_GPL(mt7925_regd_update); diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/mac.c b/drivers/net/wireless/mediatek/mt76/mt7925/mac.c index 1e44e96f034e..a4109dc72163 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7925/mac.c +++ b/drivers/net/wireless/mediatek/mt76/mt7925/mac.c @@ -1270,6 +1270,12 @@ mt7925_vif_connect_iter(void *priv, u8 *mac, bss_conf = mt792x_vif_to_bss_conf(vif, i); mconf = mt792x_vif_to_link(mvif, i); + /* Skip links that don't have bss_conf set up yet in mac80211. + * This can happen during HW reset when link state is inconsistent. 
+ */ + if (!bss_conf) + continue; + mt76_connac_mcu_uni_add_dev(&dev->mphy, bss_conf, &mconf->mt76, &mvif->sta.deflink.wcid, true); mt7925_mcu_set_tx(dev, bss_conf); @@ -1324,9 +1330,11 @@ void mt7925_mac_reset_work(struct work_struct *work) dev->hw_full_reset = false; pm->suspended = false; ieee80211_wake_queues(hw); + mt792x_mutex_acquire(dev); ieee80211_iterate_active_interfaces(hw, IEEE80211_IFACE_ITER_RESUME_ALL, mt7925_vif_connect_iter, NULL); + mt792x_mutex_release(dev); mt76_connac_power_save_sched(&dev->mt76.phy, pm); mt792x_mutex_acquire(dev); diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/main.c b/drivers/net/wireless/mediatek/mt76/mt7925/main.c index ac3d485a2f78..b6e3002faf41 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7925/main.c +++ b/drivers/net/wireless/mediatek/mt76/mt7925/main.c @@ -596,6 +596,17 @@ static int mt7925_set_link_key(struct ieee80211_hw *hw, enum set_key_cmd cmd, link_sta = sta ? mt792x_sta_to_link_sta(vif, sta, link_id) : NULL; mconf = mt792x_vif_to_link(mvif, link_id); mlink = mt792x_sta_to_link(msta, link_id); + + if (!link_conf || !mconf || !mlink) { + /* During MLO roaming, link state may be torn down before + * mac80211 requests key removal. If removing a key and + * the link is already gone, consider it successfully removed. 
+ */ + if (cmd != SET_KEY) + return 0; + return -EINVAL; + } + wcid = &mlink->wcid; wcid_keyidx = &wcid->hw_key_idx; @@ -625,8 +636,10 @@ static int mt7925_set_link_key(struct ieee80211_hw *hw, enum set_key_cmd cmd, struct mt792x_phy *phy = mt792x_hw_phy(hw); mconf->mt76.cipher = mt7925_mcu_get_cipher(key->cipher); - mt7925_mcu_add_bss_info(phy, mconf->mt76.ctx, link_conf, - link_sta, true); + err = mt7925_mcu_add_bss_info(phy, mconf->mt76.ctx, link_conf, + link_sta, true); + if (err) + goto out; } if (cmd == SET_KEY) @@ -743,9 +756,11 @@ void mt7925_set_runtime_pm(struct mt792x_dev *dev) bool monitor = !!(hw->conf.flags & IEEE80211_CONF_MONITOR); pm->enable = pm->enable_user && !monitor; + mt792x_mutex_acquire(dev); ieee80211_iterate_active_interfaces(hw, IEEE80211_IFACE_ITER_RESUME_ALL, mt7925_pm_interface_iter, dev); + mt792x_mutex_release(dev); pm->ds_enable = pm->ds_enable_user && !monitor; mt7925_mcu_set_deep_sleep(dev, pm->ds_enable); } @@ -848,12 +863,17 @@ static int mt7925_mac_link_sta_add(struct mt76_dev *mdev, msta = (struct mt792x_sta *)link_sta->sta->drv_priv; mlink = mt792x_sta_to_link(msta, link_id); + if (!mlink) + return -EINVAL; idx = mt76_wcid_alloc(dev->mt76.wcid_mask, MT792x_WTBL_STA - 1); if (idx < 0) return -ENOSPC; mconf = mt792x_vif_to_link(mvif, link_id); + if (!mconf) + return -EINVAL; + mt76_wcid_init(&mlink->wcid, 0); mlink->wcid.sta = 1; mlink->wcid.idx = idx; @@ -879,15 +899,20 @@ static int mt7925_mac_link_sta_add(struct mt76_dev *mdev, MT_WTBL_UPDATE_ADM_COUNT_CLEAR); link_conf = mt792x_vif_to_bss_conf(vif, link_id); + if (!link_conf) + return -EINVAL; /* should update bss info before STA add */ if (vif->type == NL80211_IFTYPE_STATION && !link_sta->sta->tdls) { if (ieee80211_vif_is_mld(vif)) - mt7925_mcu_add_bss_info(&dev->phy, mconf->mt76.ctx, - link_conf, link_sta, link_sta != mlink->pri_link); + ret = mt7925_mcu_add_bss_info(&dev->phy, mconf->mt76.ctx, + link_conf, link_sta, + link_sta != mlink->pri_link); else - 
mt7925_mcu_add_bss_info(&dev->phy, mconf->mt76.ctx, - link_conf, link_sta, false); + ret = mt7925_mcu_add_bss_info(&dev->phy, mconf->mt76.ctx, + link_conf, link_sta, false); + if (ret) + return ret; } if (ieee80211_vif_is_mld(vif) && @@ -985,18 +1010,29 @@ mt7925_mac_set_links(struct mt76_dev *mdev, struct ieee80211_vif *vif) { struct mt792x_dev *dev = container_of(mdev, struct mt792x_dev, mt76); struct mt792x_vif *mvif = (struct mt792x_vif *)vif->drv_priv; - struct ieee80211_bss_conf *link_conf = - mt792x_vif_to_bss_conf(vif, mvif->deflink_id); - struct cfg80211_chan_def *chandef = &link_conf->chanreq.oper; - enum nl80211_band band = chandef->chan->band, secondary_band; + struct ieee80211_bss_conf *link_conf; + struct cfg80211_chan_def *chandef; + enum nl80211_band band, secondary_band; + u16 sel_links; + u8 secondary_link_id; + + link_conf = mt792x_vif_to_bss_conf(vif, mvif->deflink_id); + if (!link_conf) + return; + + chandef = &link_conf->chanreq.oper; + band = chandef->chan->band; - u16 sel_links = mt76_select_links(vif, 2); - u8 secondary_link_id = __ffs(~BIT(mvif->deflink_id) & sel_links); + sel_links = mt76_select_links(vif, 2); + secondary_link_id = __ffs(~BIT(mvif->deflink_id) & sel_links); if (!ieee80211_vif_is_mld(vif) || hweight16(sel_links) < 2) return; link_conf = mt792x_vif_to_bss_conf(vif, secondary_link_id); + if (!link_conf) + return; + secondary_band = link_conf->chanreq.oper.chan->band; if (band == NL80211_BAND_2GHZ || @@ -1024,6 +1060,8 @@ static void mt7925_mac_link_sta_assoc(struct mt76_dev *mdev, msta = (struct mt792x_sta *)link_sta->sta->drv_priv; mlink = mt792x_sta_to_link(msta, link_sta->link_id); + if (!mlink) + return; mt792x_mutex_acquire(dev); @@ -1033,12 +1071,13 @@ static void mt7925_mac_link_sta_assoc(struct mt76_dev *mdev, link_conf = mt792x_vif_to_bss_conf(vif, vif->bss_conf.link_id); } - if (vif->type == NL80211_IFTYPE_STATION && !link_sta->sta->tdls) { + if (link_conf && vif->type == NL80211_IFTYPE_STATION && 
!link_sta->sta->tdls) { struct mt792x_bss_conf *mconf; mconf = mt792x_link_conf_to_mconf(link_conf); - mt7925_mcu_add_bss_info(&dev->phy, mconf->mt76.ctx, - link_conf, link_sta, true); + if (mconf) + mt7925_mcu_add_bss_info(&dev->phy, mconf->mt76.ctx, + link_conf, link_sta, true); } ewma_avg_signal_init(&mlink->avg_ack_signal); @@ -1085,6 +1124,8 @@ static void mt7925_mac_link_sta_remove(struct mt76_dev *mdev, msta = (struct mt792x_sta *)link_sta->sta->drv_priv; mlink = mt792x_sta_to_link(msta, link_id); + if (!mlink) + return; mt7925_roc_abort_sync(dev); @@ -1098,10 +1139,12 @@ static void mt7925_mac_link_sta_remove(struct mt76_dev *mdev, link_conf = mt792x_vif_to_bss_conf(vif, link_id); - if (vif->type == NL80211_IFTYPE_STATION && !link_sta->sta->tdls) { + if (link_conf && vif->type == NL80211_IFTYPE_STATION && !link_sta->sta->tdls) { struct mt792x_bss_conf *mconf; mconf = mt792x_link_conf_to_mconf(link_conf); + if (!mconf) + goto out; if (ieee80211_vif_is_mld(vif)) mt792x_mac_link_bss_remove(dev, mconf, mlink); @@ -1109,6 +1152,7 @@ static void mt7925_mac_link_sta_remove(struct mt76_dev *mdev, mt7925_mcu_add_bss_info(&dev->phy, mconf->mt76.ctx, link_conf, link_sta, false); } +out: spin_lock_bh(&mdev->sta_poll_lock); if (!list_empty(&mlink->wcid.poll_list)) @@ -1247,22 +1291,22 @@ mt7925_ampdu_action(struct ieee80211_hw *hw, struct ieee80211_vif *vif, case IEEE80211_AMPDU_RX_START: mt76_rx_aggr_start(&dev->mt76, &msta->deflink.wcid, tid, ssn, params->buf_size); - mt7925_mcu_uni_rx_ba(dev, params, true); + ret = mt7925_mcu_uni_rx_ba(dev, params, true); break; case IEEE80211_AMPDU_RX_STOP: mt76_rx_aggr_stop(&dev->mt76, &msta->deflink.wcid, tid); - mt7925_mcu_uni_rx_ba(dev, params, false); + ret = mt7925_mcu_uni_rx_ba(dev, params, false); break; case IEEE80211_AMPDU_TX_OPERATIONAL: mtxq->aggr = true; mtxq->send_bar = false; - mt7925_mcu_uni_tx_ba(dev, params, true); + ret = mt7925_mcu_uni_tx_ba(dev, params, true); break; case IEEE80211_AMPDU_TX_STOP_FLUSH: case 
IEEE80211_AMPDU_TX_STOP_FLUSH_CONT: mtxq->aggr = false; clear_bit(tid, &msta->deflink.wcid.ampdu_state); - mt7925_mcu_uni_tx_ba(dev, params, false); + ret = mt7925_mcu_uni_tx_ba(dev, params, false); break; case IEEE80211_AMPDU_TX_START: set_bit(tid, &msta->deflink.wcid.ampdu_state); @@ -1271,8 +1315,9 @@ mt7925_ampdu_action(struct ieee80211_hw *hw, struct ieee80211_vif *vif, case IEEE80211_AMPDU_TX_STOP_CONT: mtxq->aggr = false; clear_bit(tid, &msta->deflink.wcid.ampdu_state); - mt7925_mcu_uni_tx_ba(dev, params, false); - ieee80211_stop_tx_ba_cb_irqsafe(vif, sta->addr, tid); + ret = mt7925_mcu_uni_tx_ba(dev, params, false); + if (!ret) + ieee80211_stop_tx_ba_cb_irqsafe(vif, sta->addr, tid); break; } mt792x_mutex_release(dev); @@ -1293,12 +1338,12 @@ mt7925_mlo_pm_iter(void *priv, u8 *mac, struct ieee80211_vif *vif) if (mvif->mlo_pm_state != MT792x_MLO_CHANGED_PS) return; - mt792x_mutex_acquire(dev); for_each_set_bit(i, &valid, IEEE80211_MLD_MAX_NUM_LINKS) { bss_conf = mt792x_vif_to_bss_conf(vif, i); + if (!bss_conf) + continue; mt7925_mcu_uni_bss_ps(dev, bss_conf); } - mt792x_mutex_release(dev); } void mt7925_mlo_pm_work(struct work_struct *work) @@ -1307,9 +1352,11 @@ void mt7925_mlo_pm_work(struct work_struct *work) mlo_pm_work.work); struct ieee80211_hw *hw = mt76_hw(dev); + mt792x_mutex_acquire(dev); ieee80211_iterate_active_interfaces(hw, IEEE80211_IFACE_ITER_RESUME_ALL, mt7925_mlo_pm_iter, dev); + mt792x_mutex_release(dev); } static bool is_valid_alpha2(const char *alpha2) @@ -1645,6 +1692,8 @@ static void mt7925_ipv6_addr_change(struct ieee80211_hw *hw, for_each_set_bit(i, &valid, IEEE80211_MLD_MAX_NUM_LINKS) { bss_conf = mt792x_vif_to_bss_conf(vif, i); + if (!bss_conf) + continue; __mt7925_ipv6_addr_change(hw, bss_conf, idev); } } @@ -1706,6 +1755,9 @@ mt7925_conf_tx(struct ieee80211_hw *hw, struct ieee80211_vif *vif, [IEEE80211_AC_BK] = 1, }; + if (!mconf) + return -EINVAL; + /* firmware uses access class index */ mconf->queue_params[mq_to_aci[queue]] = 
*params; @@ -1876,6 +1928,8 @@ static void mt7925_vif_cfg_changed(struct ieee80211_hw *hw, if (changed & BSS_CHANGED_ARP_FILTER) { for_each_set_bit(i, &valid, IEEE80211_MLD_MAX_NUM_LINKS) { bss_conf = mt792x_vif_to_bss_conf(vif, i); + if (!bss_conf) + continue; mt7925_mcu_update_arp_filter(&dev->mt76, bss_conf); } } @@ -1891,6 +1945,8 @@ static void mt7925_vif_cfg_changed(struct ieee80211_hw *hw, } else if (mvif->mlo_pm_state == MT792x_MLO_CHANGED_PS) { for_each_set_bit(i, &valid, IEEE80211_MLD_MAX_NUM_LINKS) { bss_conf = mt792x_vif_to_bss_conf(vif, i); + if (!bss_conf) + continue; mt7925_mcu_uni_bss_ps(dev, bss_conf); } } @@ -1912,7 +1968,12 @@ static void mt7925_link_info_changed(struct ieee80211_hw *hw, struct ieee80211_bss_conf *link_conf; mconf = mt792x_vif_to_link(mvif, info->link_id); + if (!mconf) + return; + link_conf = mt792x_vif_to_bss_conf(vif, mconf->link_id); + if (!link_conf) + return; mt792x_mutex_acquire(dev); @@ -2033,6 +2094,11 @@ mt7925_change_vif_links(struct ieee80211_hw *hw, struct ieee80211_vif *vif, mlink = mlinks[link_id]; link_conf = mt792x_vif_to_bss_conf(vif, link_id); + if (!link_conf) { + err = -EINVAL; + goto free; + } + rcu_assign_pointer(mvif->link_conf[link_id], mconf); rcu_assign_pointer(mvif->sta.link[link_id], mlink); @@ -2113,9 +2179,14 @@ static int mt7925_assign_vif_chanctx(struct ieee80211_hw *hw, if (ieee80211_vif_is_mld(vif)) { mconf = mt792x_vif_to_link(mvif, link_conf->link_id); + if (!mconf) { + mutex_unlock(&dev->mt76.mutex); + return -EINVAL; + } + pri_link_conf = mt792x_vif_to_bss_conf(vif, mvif->deflink_id); - if (vif->type == NL80211_IFTYPE_STATION && + if (pri_link_conf && vif->type == NL80211_IFTYPE_STATION && mconf == &mvif->bss_conf) mt7925_mcu_add_bss_info(&dev->phy, NULL, pri_link_conf, NULL, true); @@ -2144,6 +2215,10 @@ static void mt7925_unassign_vif_chanctx(struct ieee80211_hw *hw, if (ieee80211_vif_is_mld(vif)) { mconf = mt792x_vif_to_link(mvif, link_conf->link_id); + if (!mconf) { + 
mutex_unlock(&dev->mt76.mutex); + return; + } if (vif->type == NL80211_IFTYPE_STATION && mconf == &mvif->bss_conf) diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/mcu.c b/drivers/net/wireless/mediatek/mt76/mt7925/mcu.c index 8eda407e4135..cf38e36790e7 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7925/mcu.c +++ b/drivers/net/wireless/mediatek/mt76/mt7925/mcu.c @@ -1722,6 +1722,10 @@ mt7925_mcu_sta_phy_tlv(struct sk_buff *skb, link_conf = mt792x_vif_to_bss_conf(vif, link_sta->link_id); mconf = mt792x_vif_to_link(mvif, link_sta->link_id); + + if (!link_conf || !mconf) + return; + chandef = mconf->mt76.ctx ? &mconf->mt76.ctx->def : &link_conf->chanreq.oper; @@ -1800,6 +1804,10 @@ mt7925_mcu_sta_rate_ctrl_tlv(struct sk_buff *skb, link_conf = mt792x_vif_to_bss_conf(vif, link_sta->link_id); mconf = mt792x_vif_to_link(mvif, link_sta->link_id); + + if (!link_conf || !mconf) + return; + chandef = mconf->mt76.ctx ? &mconf->mt76.ctx->def : &link_conf->chanreq.oper; band = chandef->chan->band; diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/pci.c b/drivers/net/wireless/mediatek/mt76/mt7925/pci.c index 8eb1fe1082d1..b6c90c5f7e91 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7925/pci.c +++ b/drivers/net/wireless/mediatek/mt76/mt7925/pci.c @@ -454,7 +454,9 @@ static int mt7925_pci_suspend(struct device *device) cancel_delayed_work_sync(&pm->ps_work); cancel_work_sync(&pm->wake_work); + mt792x_mutex_acquire(dev); mt7925_roc_abort_sync(dev); + mt792x_mutex_release(dev); err = mt792x_mcu_drv_pmctrl(dev); if (err < 0) @@ -581,10 +583,12 @@ static int _mt7925_pci_resume(struct device *device, bool restore) } /* restore previous ds setting */ + mt792x_mutex_acquire(dev); if (!pm->ds_enable) mt7925_mcu_set_deep_sleep(dev, false); mt7925_regd_update(dev); + mt792x_mutex_release(dev); failed: pm->suspended = false; diff --git a/drivers/net/wireless/mediatek/mt76/mt792x_core.c b/drivers/net/wireless/mediatek/mt76/mt792x_core.c index 9cad572c34a3..0170a23b0529 100644 
--- a/drivers/net/wireless/mediatek/mt76/mt792x_core.c +++ b/drivers/net/wireless/mediatek/mt76/mt792x_core.c @@ -95,6 +95,8 @@ void mt792x_tx(struct ieee80211_hw *hw, struct ieee80211_tx_control *control, IEEE80211_TX_CTRL_MLO_LINK); sta = (struct mt792x_sta *)control->sta->drv_priv; mlink = mt792x_sta_to_link(sta, link_id); + if (!mlink) + goto free_skb; wcid = &mlink->wcid; } @@ -113,9 +115,12 @@ void mt792x_tx(struct ieee80211_hw *hw, struct ieee80211_tx_control *control, link_id = wcid->link_id; rcu_read_lock(); conf = rcu_dereference(vif->link_conf[link_id]); - memcpy(hdr->addr2, conf->addr, ETH_ALEN); - link_sta = rcu_dereference(control->sta->link[link_id]); + if (!conf || !link_sta) { + rcu_read_unlock(); + goto free_skb; + } + memcpy(hdr->addr2, conf->addr, ETH_ALEN); memcpy(hdr->addr1, link_sta->addr, ETH_ALEN); if (vif->type == NL80211_IFTYPE_STATION) @@ -136,6 +141,10 @@ void mt792x_tx(struct ieee80211_hw *hw, struct ieee80211_tx_control *control, } mt76_connac_pm_queue_skb(hw, &dev->pm, wcid, skb); + return; + +free_skb: + ieee80211_free_txskb(hw, skb); } EXPORT_SYMBOL_GPL(mt792x_tx); ^ permalink raw reply related [flat|nested] 113+ messages in thread
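Several hunks in the patch above add the same guard to loops over valid_links: a bit can be set in the bitmap while the matching link configuration pointer is still NULL, so the loop skips that link instead of dereferencing it. A small userspace sketch of that loop shape, using hypothetical names (`struct fake_vif`, `struct fake_bss_conf`, `fake_configure_links()`) and an assumed link limit:

```c
#include <stddef.h>

#define FAKE_MAX_LINKS 15  /* assumed limit for this sketch */

/* Hypothetical per-link configuration. */
struct fake_bss_conf {
	int link_id;
};

/* Hypothetical interface: a validity bitmap plus per-link pointers. */
struct fake_vif {
	unsigned long valid_links;
	struct fake_bss_conf *link_conf[FAKE_MAX_LINKS];
};

/*
 * Walk every link marked valid and "configure" it. A link whose
 * configuration pointer is not populated yet is skipped.
 * Returns the number of links that were actually configured.
 */
static int fake_configure_links(const struct fake_vif *vif)
{
	int configured = 0;

	for (unsigned int i = 0; i < FAKE_MAX_LINKS; i++) {
		const struct fake_bss_conf *conf;

		if (!(vif->valid_links & (1UL << i)))
			continue;

		conf = vif->link_conf[i];
		if (!conf)
			continue;  /* bitmap says valid, pointer not set up yet */

		/* conf is safe to dereference from here on */
		if (conf->link_id == (int)i)
			configured++;
	}

	return configured;
}
```

With links 0 and 2 marked valid but only link 0 populated, the function configures one link and returns normally instead of faulting on the missing entry.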
* Re: [PATCH] wifi: mt76: mt7925: comprehensive stability fixes 2026-01-02 20:05 ` [PATCH] wifi: mt76: mt7925: comprehensive stability fixes Zac Bowling @ 2026-01-03 6:25 ` Sean Wang 2026-01-03 19:11 ` Zac Bowling 2026-01-05 0:26 ` [PATCH v3 00/17] wifi: mt76: mt7925/mt792x: " Zac Bowling 1 sibling, 1 reply; 113+ messages in thread From: Sean Wang @ 2026-01-03 6:25 UTC (permalink / raw) To: Zac Bowling Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless, lorenzo, nbd, ryder.lee, sean.wang Hi Zac, Thanks for the extensive work digging into the MT7925 stability issues. The problems you’re addressing are real and definitely worth fixing. For upstream review, it would help a lot to align with a few common practices: 1) One patch should handle one issue. Splitting this into smaller, self-contained patches makes review easier and allows safe reverts. 2) For fixes of runtime failures (panic, NULL deref, hangs), please include the relevant dmesg or crash log in the commit message so reviewers and downstream users can clearly see the failure being addressed and determine whether they are hitting the same issue. 3) If a fix comes from static analysis (e.g. clang static analyzer), that’s perfectly fine; just mention it in the commit message and briefly explain why the state or pointer can be invalid at runtime. 4) For review, it would also be helpful to aggregate the fixes from v1, v2, and this one into a clean v3 series based on the current wireless tree (https://github.com/nbd168/wireless.git). Sean On Fri, Jan 2, 2026 at 2:05 PM Zac Bowling <zbowling@gmail.com> wrote: > > From: Zac Bowling <zac@zacbowling.com> > > This unified patch combines all MT7925 driver fixes for kernel stability: > > 1. NULL pointer dereference fixes in vif iteration, TX path, and MCU functions > 2. Missing mutex protection in reset, ROC, PM, and resume paths > 3. Error handling for MCU commands (AMPDU, BSS info, key setup) > 4. lockdep assertions for debugging > 5. 
MLO (Multi-Link Operation) improvements for roaming and AP mode > 6. Firmware reload recovery after crashes > > These fixes address kernel panics and system hangs that occur during: > - WiFi network switching and BSSID roaming > - Suspend/resume cycles > - MLO link state transitions > - Firmware recovery after crashes > > Tested on Framework Desktop (AMD Ryzen AI Max 300) with MT7925 (RZ717). > > Individual patches and detailed analysis available at: > https://github.com/zbowling/mt7925 > > Signed-off-by: Zac Bowling <zac@zacbowling.com> > --- > diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/init.c b/drivers/net/wireless/mediatek/mt76/mt7925/init.c > index d7d5afe365ed..f800112ccaf7 100644 > --- a/drivers/net/wireless/mediatek/mt76/mt7925/init.c > +++ b/drivers/net/wireless/mediatek/mt76/mt7925/init.c > @@ -162,10 +162,17 @@ void mt7925_regd_update(struct mt792x_dev *dev) > if (!dev->regd_change) > return; > > - mt7925_mcu_set_clc(dev, mdev->alpha2, dev->country_ie_env); > + if (mt7925_mcu_set_clc(dev, mdev->alpha2, dev->country_ie_env) < 0) > + dev_warn(dev->mt76.dev, "Failed to set CLC\n"); > + > mt7925_regd_channel_update(wiphy, dev); > - mt7925_mcu_set_channel_domain(hw->priv); > - mt7925_set_tx_sar_pwr(hw, NULL); > + > + if (mt7925_mcu_set_channel_domain(hw->priv) < 0) > + dev_warn(dev->mt76.dev, "Failed to set channel domain\n"); > + > + if (mt7925_set_tx_sar_pwr(hw, NULL) < 0) > + dev_warn(dev->mt76.dev, "Failed to set TX SAR power\n"); > + > dev->regd_change = false; > } > EXPORT_SYMBOL_GPL(mt7925_regd_update); > diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/mac.c b/drivers/net/wireless/mediatek/mt76/mt7925/mac.c > index 1e44e96f034e..a4109dc72163 100644 > --- a/drivers/net/wireless/mediatek/mt76/mt7925/mac.c > +++ b/drivers/net/wireless/mediatek/mt76/mt7925/mac.c > @@ -1270,6 +1270,12 @@ mt7925_vif_connect_iter(void *priv, u8 *mac, > bss_conf = mt792x_vif_to_bss_conf(vif, i); > mconf = mt792x_vif_to_link(mvif, i); > > + /* Skip links that 
don't have bss_conf set up yet in mac80211. > + * This can happen during HW reset when link state is inconsistent. > + */ > + if (!bss_conf) > + continue; > + > mt76_connac_mcu_uni_add_dev(&dev->mphy, bss_conf, &mconf->mt76, > &mvif->sta.deflink.wcid, true); > mt7925_mcu_set_tx(dev, bss_conf); > @@ -1324,9 +1330,11 @@ void mt7925_mac_reset_work(struct work_struct *work) > dev->hw_full_reset = false; > pm->suspended = false; > ieee80211_wake_queues(hw); > + mt792x_mutex_acquire(dev); > ieee80211_iterate_active_interfaces(hw, > IEEE80211_IFACE_ITER_RESUME_ALL, > mt7925_vif_connect_iter, NULL); > + mt792x_mutex_release(dev); > mt76_connac_power_save_sched(&dev->mt76.phy, pm); > > mt792x_mutex_acquire(dev); > diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/main.c b/drivers/net/wireless/mediatek/mt76/mt7925/main.c > index ac3d485a2f78..b6e3002faf41 100644 > --- a/drivers/net/wireless/mediatek/mt76/mt7925/main.c > +++ b/drivers/net/wireless/mediatek/mt76/mt7925/main.c > @@ -596,6 +596,17 @@ static int mt7925_set_link_key(struct ieee80211_hw *hw, enum set_key_cmd cmd, > link_sta = sta ? mt792x_sta_to_link_sta(vif, sta, link_id) : NULL; > mconf = mt792x_vif_to_link(mvif, link_id); > mlink = mt792x_sta_to_link(msta, link_id); > + > + if (!link_conf || !mconf || !mlink) { > + /* During MLO roaming, link state may be torn down before > + * mac80211 requests key removal. If removing a key and > + * the link is already gone, consider it successfully removed. 
> + */ > + if (cmd != SET_KEY) > + return 0; > + return -EINVAL; > + } > + > wcid = &mlink->wcid; > wcid_keyidx = &wcid->hw_key_idx; > > @@ -625,8 +636,10 @@ static int mt7925_set_link_key(struct ieee80211_hw *hw, enum set_key_cmd cmd, > struct mt792x_phy *phy = mt792x_hw_phy(hw); > > mconf->mt76.cipher = mt7925_mcu_get_cipher(key->cipher); > - mt7925_mcu_add_bss_info(phy, mconf->mt76.ctx, link_conf, > - link_sta, true); > + err = mt7925_mcu_add_bss_info(phy, mconf->mt76.ctx, link_conf, > + link_sta, true); > + if (err) > + goto out; > } > > if (cmd == SET_KEY) > @@ -743,9 +756,11 @@ void mt7925_set_runtime_pm(struct mt792x_dev *dev) > bool monitor = !!(hw->conf.flags & IEEE80211_CONF_MONITOR); > > pm->enable = pm->enable_user && !monitor; > + mt792x_mutex_acquire(dev); > ieee80211_iterate_active_interfaces(hw, > IEEE80211_IFACE_ITER_RESUME_ALL, > mt7925_pm_interface_iter, dev); > + mt792x_mutex_release(dev); > pm->ds_enable = pm->ds_enable_user && !monitor; > mt7925_mcu_set_deep_sleep(dev, pm->ds_enable); > } > @@ -848,12 +863,17 @@ static int mt7925_mac_link_sta_add(struct mt76_dev *mdev, > > msta = (struct mt792x_sta *)link_sta->sta->drv_priv; > mlink = mt792x_sta_to_link(msta, link_id); > + if (!mlink) > + return -EINVAL; > > idx = mt76_wcid_alloc(dev->mt76.wcid_mask, MT792x_WTBL_STA - 1); > if (idx < 0) > return -ENOSPC; > > mconf = mt792x_vif_to_link(mvif, link_id); > + if (!mconf) > + return -EINVAL; > + > mt76_wcid_init(&mlink->wcid, 0); > mlink->wcid.sta = 1; > mlink->wcid.idx = idx; > @@ -879,15 +899,20 @@ static int mt7925_mac_link_sta_add(struct mt76_dev *mdev, > MT_WTBL_UPDATE_ADM_COUNT_CLEAR); > > link_conf = mt792x_vif_to_bss_conf(vif, link_id); > + if (!link_conf) > + return -EINVAL; > > /* should update bss info before STA add */ > if (vif->type == NL80211_IFTYPE_STATION && !link_sta->sta->tdls) { > if (ieee80211_vif_is_mld(vif)) > - mt7925_mcu_add_bss_info(&dev->phy, mconf->mt76.ctx, > - link_conf, link_sta, link_sta != mlink->pri_link); > + ret = 
mt7925_mcu_add_bss_info(&dev->phy, mconf->mt76.ctx, > + link_conf, link_sta, > + link_sta != mlink->pri_link); > else > - mt7925_mcu_add_bss_info(&dev->phy, mconf->mt76.ctx, > - link_conf, link_sta, false); > + ret = mt7925_mcu_add_bss_info(&dev->phy, mconf->mt76.ctx, > + link_conf, link_sta, false); > + if (ret) > + return ret; > } > > if (ieee80211_vif_is_mld(vif) && > @@ -985,18 +1010,29 @@ mt7925_mac_set_links(struct mt76_dev *mdev, struct ieee80211_vif *vif) > { > struct mt792x_dev *dev = container_of(mdev, struct mt792x_dev, mt76); > struct mt792x_vif *mvif = (struct mt792x_vif *)vif->drv_priv; > - struct ieee80211_bss_conf *link_conf = > - mt792x_vif_to_bss_conf(vif, mvif->deflink_id); > - struct cfg80211_chan_def *chandef = &link_conf->chanreq.oper; > - enum nl80211_band band = chandef->chan->band, secondary_band; > + struct ieee80211_bss_conf *link_conf; > + struct cfg80211_chan_def *chandef; > + enum nl80211_band band, secondary_band; > + u16 sel_links; > + u8 secondary_link_id; > + > + link_conf = mt792x_vif_to_bss_conf(vif, mvif->deflink_id); > + if (!link_conf) > + return; > + > + chandef = &link_conf->chanreq.oper; > + band = chandef->chan->band; > > - u16 sel_links = mt76_select_links(vif, 2); > - u8 secondary_link_id = __ffs(~BIT(mvif->deflink_id) & sel_links); > + sel_links = mt76_select_links(vif, 2); > + secondary_link_id = __ffs(~BIT(mvif->deflink_id) & sel_links); > > if (!ieee80211_vif_is_mld(vif) || hweight16(sel_links) < 2) > return; > > link_conf = mt792x_vif_to_bss_conf(vif, secondary_link_id); > + if (!link_conf) > + return; > + > secondary_band = link_conf->chanreq.oper.chan->band; > > if (band == NL80211_BAND_2GHZ || > @@ -1024,6 +1060,8 @@ static void mt7925_mac_link_sta_assoc(struct mt76_dev *mdev, > > msta = (struct mt792x_sta *)link_sta->sta->drv_priv; > mlink = mt792x_sta_to_link(msta, link_sta->link_id); > + if (!mlink) > + return; > > mt792x_mutex_acquire(dev); > > @@ -1033,12 +1071,13 @@ static void 
mt7925_mac_link_sta_assoc(struct mt76_dev *mdev, > link_conf = mt792x_vif_to_bss_conf(vif, vif->bss_conf.link_id); > } > > - if (vif->type == NL80211_IFTYPE_STATION && !link_sta->sta->tdls) { > + if (link_conf && vif->type == NL80211_IFTYPE_STATION && !link_sta->sta->tdls) { > struct mt792x_bss_conf *mconf; > > mconf = mt792x_link_conf_to_mconf(link_conf); > - mt7925_mcu_add_bss_info(&dev->phy, mconf->mt76.ctx, > - link_conf, link_sta, true); > + if (mconf) > + mt7925_mcu_add_bss_info(&dev->phy, mconf->mt76.ctx, > + link_conf, link_sta, true); > } > > ewma_avg_signal_init(&mlink->avg_ack_signal); > @@ -1085,6 +1124,8 @@ static void mt7925_mac_link_sta_remove(struct mt76_dev *mdev, > > msta = (struct mt792x_sta *)link_sta->sta->drv_priv; > mlink = mt792x_sta_to_link(msta, link_id); > + if (!mlink) > + return; > > mt7925_roc_abort_sync(dev); > > @@ -1098,10 +1139,12 @@ static void mt7925_mac_link_sta_remove(struct mt76_dev *mdev, > > link_conf = mt792x_vif_to_bss_conf(vif, link_id); > > - if (vif->type == NL80211_IFTYPE_STATION && !link_sta->sta->tdls) { > + if (link_conf && vif->type == NL80211_IFTYPE_STATION && !link_sta->sta->tdls) { > struct mt792x_bss_conf *mconf; > > mconf = mt792x_link_conf_to_mconf(link_conf); > + if (!mconf) > + goto out; > > if (ieee80211_vif_is_mld(vif)) > mt792x_mac_link_bss_remove(dev, mconf, mlink); > @@ -1109,6 +1152,7 @@ static void mt7925_mac_link_sta_remove(struct mt76_dev *mdev, > mt7925_mcu_add_bss_info(&dev->phy, mconf->mt76.ctx, link_conf, > link_sta, false); > } > +out: > > spin_lock_bh(&mdev->sta_poll_lock); > if (!list_empty(&mlink->wcid.poll_list)) > @@ -1247,22 +1291,22 @@ mt7925_ampdu_action(struct ieee80211_hw *hw, struct ieee80211_vif *vif, > case IEEE80211_AMPDU_RX_START: > mt76_rx_aggr_start(&dev->mt76, &msta->deflink.wcid, tid, ssn, > params->buf_size); > - mt7925_mcu_uni_rx_ba(dev, params, true); > + ret = mt7925_mcu_uni_rx_ba(dev, params, true); > break; > case IEEE80211_AMPDU_RX_STOP: > 
mt76_rx_aggr_stop(&dev->mt76, &msta->deflink.wcid, tid); > - mt7925_mcu_uni_rx_ba(dev, params, false); > + ret = mt7925_mcu_uni_rx_ba(dev, params, false); > break; > case IEEE80211_AMPDU_TX_OPERATIONAL: > mtxq->aggr = true; > mtxq->send_bar = false; > - mt7925_mcu_uni_tx_ba(dev, params, true); > + ret = mt7925_mcu_uni_tx_ba(dev, params, true); > break; > case IEEE80211_AMPDU_TX_STOP_FLUSH: > case IEEE80211_AMPDU_TX_STOP_FLUSH_CONT: > mtxq->aggr = false; > clear_bit(tid, &msta->deflink.wcid.ampdu_state); > - mt7925_mcu_uni_tx_ba(dev, params, false); > + ret = mt7925_mcu_uni_tx_ba(dev, params, false); > break; > case IEEE80211_AMPDU_TX_START: > set_bit(tid, &msta->deflink.wcid.ampdu_state); > @@ -1271,8 +1315,9 @@ mt7925_ampdu_action(struct ieee80211_hw *hw, struct ieee80211_vif *vif, > case IEEE80211_AMPDU_TX_STOP_CONT: > mtxq->aggr = false; > clear_bit(tid, &msta->deflink.wcid.ampdu_state); > - mt7925_mcu_uni_tx_ba(dev, params, false); > - ieee80211_stop_tx_ba_cb_irqsafe(vif, sta->addr, tid); > + ret = mt7925_mcu_uni_tx_ba(dev, params, false); > + if (!ret) > + ieee80211_stop_tx_ba_cb_irqsafe(vif, sta->addr, tid); > break; > } > mt792x_mutex_release(dev); > @@ -1293,12 +1338,12 @@ mt7925_mlo_pm_iter(void *priv, u8 *mac, struct ieee80211_vif *vif) > if (mvif->mlo_pm_state != MT792x_MLO_CHANGED_PS) > return; > > - mt792x_mutex_acquire(dev); > for_each_set_bit(i, &valid, IEEE80211_MLD_MAX_NUM_LINKS) { > bss_conf = mt792x_vif_to_bss_conf(vif, i); > + if (!bss_conf) > + continue; > mt7925_mcu_uni_bss_ps(dev, bss_conf); > } > - mt792x_mutex_release(dev); > } > > void mt7925_mlo_pm_work(struct work_struct *work) > @@ -1307,9 +1352,11 @@ void mt7925_mlo_pm_work(struct work_struct *work) > mlo_pm_work.work); > struct ieee80211_hw *hw = mt76_hw(dev); > > + mt792x_mutex_acquire(dev); > ieee80211_iterate_active_interfaces(hw, > IEEE80211_IFACE_ITER_RESUME_ALL, > mt7925_mlo_pm_iter, dev); > + mt792x_mutex_release(dev); > } > > static bool is_valid_alpha2(const char *alpha2) > 
@@ -1645,6 +1692,8 @@ static void mt7925_ipv6_addr_change(struct ieee80211_hw *hw, > > for_each_set_bit(i, &valid, IEEE80211_MLD_MAX_NUM_LINKS) { > bss_conf = mt792x_vif_to_bss_conf(vif, i); > + if (!bss_conf) > + continue; > __mt7925_ipv6_addr_change(hw, bss_conf, idev); > } > } > @@ -1706,6 +1755,9 @@ mt7925_conf_tx(struct ieee80211_hw *hw, struct ieee80211_vif *vif, > [IEEE80211_AC_BK] = 1, > }; > > + if (!mconf) > + return -EINVAL; > + > /* firmware uses access class index */ > mconf->queue_params[mq_to_aci[queue]] = *params; > > @@ -1876,6 +1928,8 @@ static void mt7925_vif_cfg_changed(struct ieee80211_hw *hw, > if (changed & BSS_CHANGED_ARP_FILTER) { > for_each_set_bit(i, &valid, IEEE80211_MLD_MAX_NUM_LINKS) { > bss_conf = mt792x_vif_to_bss_conf(vif, i); > + if (!bss_conf) > + continue; > mt7925_mcu_update_arp_filter(&dev->mt76, bss_conf); > } > } > @@ -1891,6 +1945,8 @@ static void mt7925_vif_cfg_changed(struct ieee80211_hw *hw, > } else if (mvif->mlo_pm_state == MT792x_MLO_CHANGED_PS) { > for_each_set_bit(i, &valid, IEEE80211_MLD_MAX_NUM_LINKS) { > bss_conf = mt792x_vif_to_bss_conf(vif, i); > + if (!bss_conf) > + continue; > mt7925_mcu_uni_bss_ps(dev, bss_conf); > } > } > @@ -1912,7 +1968,12 @@ static void mt7925_link_info_changed(struct ieee80211_hw *hw, > struct ieee80211_bss_conf *link_conf; > > mconf = mt792x_vif_to_link(mvif, info->link_id); > + if (!mconf) > + return; > + > link_conf = mt792x_vif_to_bss_conf(vif, mconf->link_id); > + if (!link_conf) > + return; > > mt792x_mutex_acquire(dev); > > @@ -2033,6 +2094,11 @@ mt7925_change_vif_links(struct ieee80211_hw *hw, struct ieee80211_vif *vif, > mlink = mlinks[link_id]; > link_conf = mt792x_vif_to_bss_conf(vif, link_id); > > + if (!link_conf) { > + err = -EINVAL; > + goto free; > + } > + > rcu_assign_pointer(mvif->link_conf[link_id], mconf); > rcu_assign_pointer(mvif->sta.link[link_id], mlink); > > @@ -2113,9 +2179,14 @@ static int mt7925_assign_vif_chanctx(struct ieee80211_hw *hw, > > if 
(ieee80211_vif_is_mld(vif)) { > mconf = mt792x_vif_to_link(mvif, link_conf->link_id); > + if (!mconf) { > + mutex_unlock(&dev->mt76.mutex); > + return -EINVAL; > + } > + > pri_link_conf = mt792x_vif_to_bss_conf(vif, mvif->deflink_id); > > - if (vif->type == NL80211_IFTYPE_STATION && > + if (pri_link_conf && vif->type == NL80211_IFTYPE_STATION && > mconf == &mvif->bss_conf) > mt7925_mcu_add_bss_info(&dev->phy, NULL, pri_link_conf, > NULL, true); > @@ -2144,6 +2215,10 @@ static void mt7925_unassign_vif_chanctx(struct ieee80211_hw *hw, > > if (ieee80211_vif_is_mld(vif)) { > mconf = mt792x_vif_to_link(mvif, link_conf->link_id); > + if (!mconf) { > + mutex_unlock(&dev->mt76.mutex); > + return; > + } > > if (vif->type == NL80211_IFTYPE_STATION && > mconf == &mvif->bss_conf) > diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/mcu.c b/drivers/net/wireless/mediatek/mt76/mt7925/mcu.c > index 8eda407e4135..cf38e36790e7 100644 > --- a/drivers/net/wireless/mediatek/mt76/mt7925/mcu.c > +++ b/drivers/net/wireless/mediatek/mt76/mt7925/mcu.c > @@ -1722,6 +1722,10 @@ mt7925_mcu_sta_phy_tlv(struct sk_buff *skb, > > link_conf = mt792x_vif_to_bss_conf(vif, link_sta->link_id); > mconf = mt792x_vif_to_link(mvif, link_sta->link_id); > + > + if (!link_conf || !mconf) > + return; > + > chandef = mconf->mt76.ctx ? &mconf->mt76.ctx->def : > &link_conf->chanreq.oper; > > @@ -1800,6 +1804,10 @@ mt7925_mcu_sta_rate_ctrl_tlv(struct sk_buff *skb, > > link_conf = mt792x_vif_to_bss_conf(vif, link_sta->link_id); > mconf = mt792x_vif_to_link(mvif, link_sta->link_id); > + > + if (!link_conf || !mconf) > + return; > + > chandef = mconf->mt76.ctx ? 
&mconf->mt76.ctx->def : > &link_conf->chanreq.oper; > band = chandef->chan->band; > diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/pci.c b/drivers/net/wireless/mediatek/mt76/mt7925/pci.c > index 8eb1fe1082d1..b6c90c5f7e91 100644 > --- a/drivers/net/wireless/mediatek/mt76/mt7925/pci.c > +++ b/drivers/net/wireless/mediatek/mt76/mt7925/pci.c > @@ -454,7 +454,9 @@ static int mt7925_pci_suspend(struct device *device) > cancel_delayed_work_sync(&pm->ps_work); > cancel_work_sync(&pm->wake_work); > > + mt792x_mutex_acquire(dev); > mt7925_roc_abort_sync(dev); > + mt792x_mutex_release(dev); > > err = mt792x_mcu_drv_pmctrl(dev); > if (err < 0) > @@ -581,10 +583,12 @@ static int _mt7925_pci_resume(struct device *device, bool restore) > } > > /* restore previous ds setting */ > + mt792x_mutex_acquire(dev); > if (!pm->ds_enable) > mt7925_mcu_set_deep_sleep(dev, false); > > mt7925_regd_update(dev); > + mt792x_mutex_release(dev); > failed: > pm->suspended = false; > > diff --git a/drivers/net/wireless/mediatek/mt76/mt792x_core.c b/drivers/net/wireless/mediatek/mt76/mt792x_core.c > index 9cad572c34a3..0170a23b0529 100644 > --- a/drivers/net/wireless/mediatek/mt76/mt792x_core.c > +++ b/drivers/net/wireless/mediatek/mt76/mt792x_core.c > @@ -95,6 +95,8 @@ void mt792x_tx(struct ieee80211_hw *hw, struct ieee80211_tx_control *control, > IEEE80211_TX_CTRL_MLO_LINK); > sta = (struct mt792x_sta *)control->sta->drv_priv; > mlink = mt792x_sta_to_link(sta, link_id); > + if (!mlink) > + goto free_skb; > wcid = &mlink->wcid; > } > > @@ -113,9 +115,12 @@ void mt792x_tx(struct ieee80211_hw *hw, struct ieee80211_tx_control *control, > link_id = wcid->link_id; > rcu_read_lock(); > conf = rcu_dereference(vif->link_conf[link_id]); > - memcpy(hdr->addr2, conf->addr, ETH_ALEN); > - > link_sta = rcu_dereference(control->sta->link[link_id]); > + if (!conf || !link_sta) { > + rcu_read_unlock(); > + goto free_skb; > + } > + memcpy(hdr->addr2, conf->addr, ETH_ALEN); > memcpy(hdr->addr1, 
link_sta->addr, ETH_ALEN); > > if (vif->type == NL80211_IFTYPE_STATION) > @@ -136,6 +141,10 @@ void mt792x_tx(struct ieee80211_hw *hw, struct ieee80211_tx_control *control, > } > > mt76_connac_pm_queue_skb(hw, &dev->pm, wcid, skb); > + return; > + > +free_skb: > + ieee80211_free_txskb(hw, skb); > } > EXPORT_SYMBOL_GPL(mt792x_tx); > > ^ permalink raw reply [flat|nested] 113+ messages in thread
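Several hunks in the patch quoted above guard loops over the valid-links bitmap by skipping any link whose `bss_conf` lookup returns NULL. A minimal standalone sketch of that pattern follows, under stated assumptions: `struct bss_conf` here is a toy type, `for_each_ready_link()` is a hypothetical helper, and a plain array of pointers stands in for the mac80211 lookup that mt792x_vif_to_bss_conf() performs. The value of `MAX_LINKS` is chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_LINKS 15 /* illustrative stand-in for IEEE80211_MLD_MAX_NUM_LINKS */

/* Hypothetical per-link configuration; a NULL slot means "not set up yet". */
struct bss_conf { int link_id; };

/*
 * Visit every link whose bit is set in valid_links and call fn on it,
 * skipping slots whose configuration pointer is still NULL.
 * Returns the number of links actually visited.
 */
static int for_each_ready_link(unsigned long valid_links,
			       struct bss_conf *const links[MAX_LINKS],
			       void (*fn)(struct bss_conf *))
{
	int visited = 0;

	assert(links != NULL);

	for (int i = 0; i < MAX_LINKS; i++) {
		if (!(valid_links & (1UL << i)))
			continue; /* bit clear: link is not marked valid */
		if (!links[i])
			continue; /* marked valid, but no configuration yet */
		if (fn)
			fn(links[i]);
		visited++;
	}
	return visited;
}
```

The point of the second `continue` is that the bitmap and the pointer table can disagree for a short time, for example during a reset, so the bitmap alone is not treated as proof that the pointer is usable.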
* Re: [PATCH] wifi: mt76: mt7925: comprehensive stability fixes
  2026-01-03  6:25 ` Sean Wang
@ 2026-01-03 19:11 ` Zac Bowling
  0 siblings, 0 replies; 113+ messages in thread
From: Zac Bowling @ 2026-01-03 19:11 UTC
To: Sean Wang
Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless, lorenzo, nbd, ryder.lee, sean.wang

Hi Sean,

1) I have some 17 smaller patches that go into more detail, but I didn't
want to spam the list. When sending here to LKML I squashed those down into
about 7 initially, which I have already sent over the last 3 days. I wasn't
sure whether a fully squashed patch like this one would be easier to review,
since it's all related to a similar class of bug.

Most of this stemmed from a NULL deref crash I ran into in a few places,
either during re-auth attempts that failed from a bad key or from other
races in state changes. When I was investigating and searching Google, I
found half a dozen other similar crashes posted on forums by folks who hit
similar things, but no correct solution, just folks doing hacks. That's
when I wrote a little stress test tool: if you have the right AP
environment, like two APs with the same SSID with WiFi 7 and MLO enabled,
you can trigger different races with some hammering and get a consistent
repro case. I'm using 3 Ubiquiti 7 Pro APs and it's been panic city over
here the last month, which got me motivated enough to investigate over my
holiday break. Some of these NULL and error return checks are purely
defensive additions to prevent future regressions.

2) I have a folder full of dumps from my stress tests to repro :) They all
look similar to this, but the exact NULL deref is not always in the same
place. All of them come after ieee80211_iterate_interfaces, so my patches
mostly work to check locks in or around that call. For the deadlocks I have
some dmesg logs, but they aren't too interesting.

[ 655.737302] [ T12] BUG: kernel NULL pointer dereference, address: 0000000000000000
[ 655.737320] [ T12] #PF: supervisor read access in kernel mode
[ 655.737324] [ T12] #PF: error_code(0x0000) - not-present page
[ 655.737328] [ T12] PGD 0 P4D 0
[ 655.737334] [ T12] Oops: Oops: 0000 [#1] SMP NOPTI
[ 655.737342] [ T12] CPU: 20 UID: 0 PID: 12 Comm: kworker/u128:0 Kdump: loaded Tainted: G OE 6.17.0-8-generic #8-Ubuntu PREEMPT(voluntary)
[ 655.737350] [ T12] Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
[ 655.737351] [ T12] Hardware name: Framework Desktop (AMD Ryzen AI Max 300 Series)/FRANMFCP06, BIOS 03.04 11/19/2025
[ 655.737354] [ T12] Workqueue: mt76 mt7925_mac_reset_work [mt7925_common]
[ 655.737370] [ T12] RIP: 0010:mt76_connac_mcu_uni_add_dev+0xba/0x1f0 [mt76_connac_lib]
[ 655.737385] [ T12] Code: cc 66 44 89 5d d2 44 88 45 d4 44 88 4d d5 88 65 d7 c6 45 dc 01 88 55 dd 0f b7 97 b8 00 00 00 88 4d ef 66 89 55 e4 66 89 55 ea <48> 8b 16 8b 12 83 fa 03 0f 84 0c 01 00 00 77 1b 83 fa 01 0f 84 f5
[ 655.737388] [ T12] RSP: 0018:ffffd07fc018fcb0 EFLAGS: 00010282
[ 655.737392] [ T12] RAX: 000000000000ff00 RBX: ffff8a4449442040 RCX: 0000000000000000
[ 655.737394] [ T12] RDX: 0000000000000013 RSI: 0000000000000000 RDI: ffff8a44c7d7a4b0
[ 655.737396] [ T12] RBP: ffffd07fc018fcf8 R08: 0000000000000001 R09: 0000000000000000
[ 655.737397] [ T12] R10: 0000000000000000 R11: 0000000000000020 R12: ffff8a4449442040
[ 655.737399] [ T12] R13: ffff8a44c7d79f08 R14: 0000000000000000 R15: ffff8a44c7d78a80
[ 655.737401] [ T12] FS: 0000000000000000(0000) GS:ffff8a53ca47f000(0000) knlGS:0000000000000000
[ 655.737403] [ T12] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 655.737404] [ T12] CR2: 0000000000000000 CR3: 00000009e2a40000 CR4: 0000000000f50ef0
[ 655.737406] [ T12] PKRU: 55555554
[ 655.737408] [ T12] Call Trace:
[ 655.737411] [ T12]
[ 655.737416] [ T12] mt7925_vif_connect_iter+0xcb/0x240 [mt7925_common]
[ 655.737423] [ T12] __iterate_interfaces+0x92/0x130 [mac80211]
[ 655.737500] [ T12] ? __pfx_mt7925_vif_connect_iter+0x10/0x10 [mt7925_common]
[ 655.737506] [ T12] ieee80211_iterate_interfaces+0x3d/0x60 [mac80211]
[ 655.737549] [ T12] ? __pfx_mt7925_vif_connect_iter+0x10/0x10 [mt7925_common]
[ 655.737553] [ T12] mt7925_mac_reset_work+0x105/0x190 [mt7925_common]
[ 655.737559] [ T12] process_one_work+0x18b/0x370
[ 655.737567] [ T12] worker_thread+0x317/0x450
[ 655.737570] [ T12] ? __pfx_worker_thread+0x10/0x10
[ 655.737573] [ T12] kthread+0x108/0x220
[ 655.737577] [ T12] ? __pfx_kthread+0x10/0x10
[ 655.737579] [ T12] ret_from_fork+0x131/0x150
[ 655.737585] [ T12] ? __pfx_kthread+0x10/0x10
[ 655.737587] [ T12] ret_from_fork_asm+0x1a/0x30

3) Yeah, some of my later patches came from static analysis (clang-tidy,
etc.) but also AI, mostly looking for additional cases I missed since I'm
new to this code. Some of the patches were actually ports of changes made
for other MT chipsets that had not been applied to this one and seemed
relevant. I'll try to include more details. This is all just an attempt to
get my personal machine stable overnight so I don't have to run ethernet
and just blacklist the driver :)

4) Sounds good! Will do!

Zac Bowling

On Fri, Jan 2, 2026 at 10:26 PM Sean Wang <sean.wang@kernel.org> wrote:
>
> Hi Zac,
>
> Thanks for the extensive work digging into the MT7925 stability issues
> the problems you’re addressing are real and definitely worth fixing.
>
> For upstream review, it would help a lot to align with a few common practices:
>
> 1) One patch should handle one issue. Splitting this into smaller,
> self-contained patches makes review easier and allows safe reverts.
>
> 2) For fixes of runtime failures (panic, NULL deref, hangs), please include
> the relevant dmesg or crash log in the commit message so reviewers and
> downstream users can clearly see the failure being addressed and
> determine whether they are hitting the same issue.
>
> 3) If a fix comes from static analysis (e.g.
clang static analyzer), that’s > perfectly fine, just mention it in the commit message and briefly explain > why the state or pointer can be invalid at runtime. > > 4) For review, it would also be helpful to aggregate the fixes from v1, v2, > and this one into a clean v3 series based on the current wireless > tree (https://github.com/nbd168/wireless.git). > > Sean > > On Fri, Jan 2, 2026 at 2:05 PM Zac Bowling <zbowling@gmail.com> wrote: > > > > From: Zac Bowling <zac@zacbowling.com> > > > > This unified patch combines all MT7925 driver fixes for kernel stability: > > > > 1. NULL pointer dereference fixes in vif iteration, TX path, and MCU functions > > 2. Missing mutex protection in reset, ROC, PM, and resume paths > > 3. Error handling for MCU commands (AMPDU, BSS info, key setup) > > 4. lockdep assertions for debugging > > 5. MLO (Multi-Link Operation) improvements for roaming and AP mode > > 6. Firmware reload recovery after crashes > > > > These fixes address kernel panics and system hangs that occur during: > > - WiFi network switching and BSSID roaming > > - Suspend/resume cycles > > - MLO link state transitions > > - Firmware recovery after crashes > > > > Tested on Framework Desktop (AMD Ryzen AI Max 300) with MT7925 (RZ717). 
> > > > Individual patches and detailed analysis available at: > > https://github.com/zbowling/mt7925 > > > > Signed-off-by: Zac Bowling <zac@zacbowling.com> > > --- > > diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/init.c b/drivers/net/wireless/mediatek/mt76/mt7925/init.c > > index d7d5afe365ed..f800112ccaf7 100644 > > --- a/drivers/net/wireless/mediatek/mt76/mt7925/init.c > > +++ b/drivers/net/wireless/mediatek/mt76/mt7925/init.c > > @@ -162,10 +162,17 @@ void mt7925_regd_update(struct mt792x_dev *dev) > > if (!dev->regd_change) > > return; > > > > - mt7925_mcu_set_clc(dev, mdev->alpha2, dev->country_ie_env); > > + if (mt7925_mcu_set_clc(dev, mdev->alpha2, dev->country_ie_env) < 0) > > + dev_warn(dev->mt76.dev, "Failed to set CLC\n"); > > + > > mt7925_regd_channel_update(wiphy, dev); > > - mt7925_mcu_set_channel_domain(hw->priv); > > - mt7925_set_tx_sar_pwr(hw, NULL); > > + > > + if (mt7925_mcu_set_channel_domain(hw->priv) < 0) > > + dev_warn(dev->mt76.dev, "Failed to set channel domain\n"); > > + > > + if (mt7925_set_tx_sar_pwr(hw, NULL) < 0) > > + dev_warn(dev->mt76.dev, "Failed to set TX SAR power\n"); > > + > > dev->regd_change = false; > > } > > EXPORT_SYMBOL_GPL(mt7925_regd_update); > > diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/mac.c b/drivers/net/wireless/mediatek/mt76/mt7925/mac.c > > index 1e44e96f034e..a4109dc72163 100644 > > --- a/drivers/net/wireless/mediatek/mt76/mt7925/mac.c > > +++ b/drivers/net/wireless/mediatek/mt76/mt7925/mac.c > > @@ -1270,6 +1270,12 @@ mt7925_vif_connect_iter(void *priv, u8 *mac, > > bss_conf = mt792x_vif_to_bss_conf(vif, i); > > mconf = mt792x_vif_to_link(mvif, i); > > > > + /* Skip links that don't have bss_conf set up yet in mac80211. > > + * This can happen during HW reset when link state is inconsistent. 
> > + */ > > + if (!bss_conf) > > + continue; > > + > > mt76_connac_mcu_uni_add_dev(&dev->mphy, bss_conf, &mconf->mt76, > > &mvif->sta.deflink.wcid, true); > > mt7925_mcu_set_tx(dev, bss_conf); > > @@ -1324,9 +1330,11 @@ void mt7925_mac_reset_work(struct work_struct *work) > > dev->hw_full_reset = false; > > pm->suspended = false; > > ieee80211_wake_queues(hw); > > + mt792x_mutex_acquire(dev); > > ieee80211_iterate_active_interfaces(hw, > > IEEE80211_IFACE_ITER_RESUME_ALL, > > mt7925_vif_connect_iter, NULL); > > + mt792x_mutex_release(dev); > > mt76_connac_power_save_sched(&dev->mt76.phy, pm); > > > > mt792x_mutex_acquire(dev); > > diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/main.c b/drivers/net/wireless/mediatek/mt76/mt7925/main.c > > index ac3d485a2f78..b6e3002faf41 100644 > > --- a/drivers/net/wireless/mediatek/mt76/mt7925/main.c > > +++ b/drivers/net/wireless/mediatek/mt76/mt7925/main.c > > @@ -596,6 +596,17 @@ static int mt7925_set_link_key(struct ieee80211_hw *hw, enum set_key_cmd cmd, > > link_sta = sta ? mt792x_sta_to_link_sta(vif, sta, link_id) : NULL; > > mconf = mt792x_vif_to_link(mvif, link_id); > > mlink = mt792x_sta_to_link(msta, link_id); > > + > > + if (!link_conf || !mconf || !mlink) { > > + /* During MLO roaming, link state may be torn down before > > + * mac80211 requests key removal. If removing a key and > > + * the link is already gone, consider it successfully removed. 
> > + */ > > + if (cmd != SET_KEY) > > + return 0; > > + return -EINVAL; > > + } > > + > > wcid = &mlink->wcid; > > wcid_keyidx = &wcid->hw_key_idx; > > > > @@ -625,8 +636,10 @@ static int mt7925_set_link_key(struct ieee80211_hw *hw, enum set_key_cmd cmd, > > struct mt792x_phy *phy = mt792x_hw_phy(hw); > > > > mconf->mt76.cipher = mt7925_mcu_get_cipher(key->cipher); > > - mt7925_mcu_add_bss_info(phy, mconf->mt76.ctx, link_conf, > > - link_sta, true); > > + err = mt7925_mcu_add_bss_info(phy, mconf->mt76.ctx, link_conf, > > + link_sta, true); > > + if (err) > > + goto out; > > } > > > > if (cmd == SET_KEY) > > @@ -743,9 +756,11 @@ void mt7925_set_runtime_pm(struct mt792x_dev *dev) > > bool monitor = !!(hw->conf.flags & IEEE80211_CONF_MONITOR); > > > > pm->enable = pm->enable_user && !monitor; > > + mt792x_mutex_acquire(dev); > > ieee80211_iterate_active_interfaces(hw, > > IEEE80211_IFACE_ITER_RESUME_ALL, > > mt7925_pm_interface_iter, dev); > > + mt792x_mutex_release(dev); > > pm->ds_enable = pm->ds_enable_user && !monitor; > > mt7925_mcu_set_deep_sleep(dev, pm->ds_enable); > > } > > @@ -848,12 +863,17 @@ static int mt7925_mac_link_sta_add(struct mt76_dev *mdev, > > > > msta = (struct mt792x_sta *)link_sta->sta->drv_priv; > > mlink = mt792x_sta_to_link(msta, link_id); > > + if (!mlink) > > + return -EINVAL; > > > > idx = mt76_wcid_alloc(dev->mt76.wcid_mask, MT792x_WTBL_STA - 1); > > if (idx < 0) > > return -ENOSPC; > > > > mconf = mt792x_vif_to_link(mvif, link_id); > > + if (!mconf) > > + return -EINVAL; > > + > > mt76_wcid_init(&mlink->wcid, 0); > > mlink->wcid.sta = 1; > > mlink->wcid.idx = idx; > > @@ -879,15 +899,20 @@ static int mt7925_mac_link_sta_add(struct mt76_dev *mdev, > > MT_WTBL_UPDATE_ADM_COUNT_CLEAR); > > > > link_conf = mt792x_vif_to_bss_conf(vif, link_id); > > + if (!link_conf) > > + return -EINVAL; > > > > /* should update bss info before STA add */ > > if (vif->type == NL80211_IFTYPE_STATION && !link_sta->sta->tdls) { > > if 
(ieee80211_vif_is_mld(vif))
> > -			mt7925_mcu_add_bss_info(&dev->phy, mconf->mt76.ctx,
> > -						link_conf, link_sta, link_sta != mlink->pri_link);
> > +			ret = mt7925_mcu_add_bss_info(&dev->phy, mconf->mt76.ctx,
> > +						      link_conf, link_sta,
> > +						      link_sta != mlink->pri_link);
> >  		else
> > -			mt7925_mcu_add_bss_info(&dev->phy, mconf->mt76.ctx,
> > -						link_conf, link_sta, false);
> > +			ret = mt7925_mcu_add_bss_info(&dev->phy, mconf->mt76.ctx,
> > +						      link_conf, link_sta, false);
> > +		if (ret)
> > +			return ret;
> >  	}
> >
> >  	if (ieee80211_vif_is_mld(vif) &&
> > @@ -985,18 +1010,29 @@ mt7925_mac_set_links(struct mt76_dev *mdev, struct ieee80211_vif *vif)
> >  {
> >  	struct mt792x_dev *dev = container_of(mdev, struct mt792x_dev, mt76);
> >  	struct mt792x_vif *mvif = (struct mt792x_vif *)vif->drv_priv;
> > -	struct ieee80211_bss_conf *link_conf =
> > -		mt792x_vif_to_bss_conf(vif, mvif->deflink_id);
> > -	struct cfg80211_chan_def *chandef = &link_conf->chanreq.oper;
> > -	enum nl80211_band band = chandef->chan->band, secondary_band;
> > +	struct ieee80211_bss_conf *link_conf;
> > +	struct cfg80211_chan_def *chandef;
> > +	enum nl80211_band band, secondary_band;
> > +	u16 sel_links;
> > +	u8 secondary_link_id;
> > +
> > +	link_conf = mt792x_vif_to_bss_conf(vif, mvif->deflink_id);
> > +	if (!link_conf)
> > +		return;
> > +
> > +	chandef = &link_conf->chanreq.oper;
> > +	band = chandef->chan->band;
> >
> > -	u16 sel_links = mt76_select_links(vif, 2);
> > -	u8 secondary_link_id = __ffs(~BIT(mvif->deflink_id) & sel_links);
> > +	sel_links = mt76_select_links(vif, 2);
> > +	secondary_link_id = __ffs(~BIT(mvif->deflink_id) & sel_links);
> >
> >  	if (!ieee80211_vif_is_mld(vif) || hweight16(sel_links) < 2)
> >  		return;
> >
> >  	link_conf = mt792x_vif_to_bss_conf(vif, secondary_link_id);
> > +	if (!link_conf)
> > +		return;
> > +
> >  	secondary_band = link_conf->chanreq.oper.chan->band;
> >
> >  	if (band == NL80211_BAND_2GHZ ||
> > @@ -1024,6 +1060,8 @@ static void mt7925_mac_link_sta_assoc(struct mt76_dev *mdev,
> >
> >  	msta = (struct mt792x_sta *)link_sta->sta->drv_priv;
> >  	mlink = mt792x_sta_to_link(msta, link_sta->link_id);
> > +	if (!mlink)
> > +		return;
> >
> >  	mt792x_mutex_acquire(dev);
> >
> > @@ -1033,12 +1071,13 @@ static void mt7925_mac_link_sta_assoc(struct mt76_dev *mdev,
> >  		link_conf = mt792x_vif_to_bss_conf(vif, vif->bss_conf.link_id);
> >  	}
> >
> > -	if (vif->type == NL80211_IFTYPE_STATION && !link_sta->sta->tdls) {
> > +	if (link_conf && vif->type == NL80211_IFTYPE_STATION && !link_sta->sta->tdls) {
> >  		struct mt792x_bss_conf *mconf;
> >
> >  		mconf = mt792x_link_conf_to_mconf(link_conf);
> > -		mt7925_mcu_add_bss_info(&dev->phy, mconf->mt76.ctx,
> > -					link_conf, link_sta, true);
> > +		if (mconf)
> > +			mt7925_mcu_add_bss_info(&dev->phy, mconf->mt76.ctx,
> > +						link_conf, link_sta, true);
> >  	}
> >
> >  	ewma_avg_signal_init(&mlink->avg_ack_signal);
> > @@ -1085,6 +1124,8 @@ static void mt7925_mac_link_sta_remove(struct mt76_dev *mdev,
> >
> >  	msta = (struct mt792x_sta *)link_sta->sta->drv_priv;
> >  	mlink = mt792x_sta_to_link(msta, link_id);
> > +	if (!mlink)
> > +		return;
> >
> >  	mt7925_roc_abort_sync(dev);
> >
> > @@ -1098,10 +1139,12 @@ static void mt7925_mac_link_sta_remove(struct mt76_dev *mdev,
> >
> >  	link_conf = mt792x_vif_to_bss_conf(vif, link_id);
> >
> > -	if (vif->type == NL80211_IFTYPE_STATION && !link_sta->sta->tdls) {
> > +	if (link_conf && vif->type == NL80211_IFTYPE_STATION && !link_sta->sta->tdls) {
> >  		struct mt792x_bss_conf *mconf;
> >
> >  		mconf = mt792x_link_conf_to_mconf(link_conf);
> > +		if (!mconf)
> > +			goto out;
> >
> >  		if (ieee80211_vif_is_mld(vif))
> >  			mt792x_mac_link_bss_remove(dev, mconf, mlink);
> > @@ -1109,6 +1152,7 @@ static void mt7925_mac_link_sta_remove(struct mt76_dev *mdev,
> >  			mt7925_mcu_add_bss_info(&dev->phy, mconf->mt76.ctx, link_conf,
> >  						link_sta, false);
> >  	}
> > +out:
> >
> >  	spin_lock_bh(&mdev->sta_poll_lock);
> >  	if (!list_empty(&mlink->wcid.poll_list))
> > @@ -1247,22 +1291,22 @@ mt7925_ampdu_action(struct ieee80211_hw *hw, struct ieee80211_vif *vif,
> >  	case IEEE80211_AMPDU_RX_START:
> >  		mt76_rx_aggr_start(&dev->mt76, &msta->deflink.wcid, tid, ssn,
> >  				   params->buf_size);
> > -		mt7925_mcu_uni_rx_ba(dev, params, true);
> > +		ret = mt7925_mcu_uni_rx_ba(dev, params, true);
> >  		break;
> >  	case IEEE80211_AMPDU_RX_STOP:
> >  		mt76_rx_aggr_stop(&dev->mt76, &msta->deflink.wcid, tid);
> > -		mt7925_mcu_uni_rx_ba(dev, params, false);
> > +		ret = mt7925_mcu_uni_rx_ba(dev, params, false);
> >  		break;
> >  	case IEEE80211_AMPDU_TX_OPERATIONAL:
> >  		mtxq->aggr = true;
> >  		mtxq->send_bar = false;
> > -		mt7925_mcu_uni_tx_ba(dev, params, true);
> > +		ret = mt7925_mcu_uni_tx_ba(dev, params, true);
> >  		break;
> >  	case IEEE80211_AMPDU_TX_STOP_FLUSH:
> >  	case IEEE80211_AMPDU_TX_STOP_FLUSH_CONT:
> >  		mtxq->aggr = false;
> >  		clear_bit(tid, &msta->deflink.wcid.ampdu_state);
> > -		mt7925_mcu_uni_tx_ba(dev, params, false);
> > +		ret = mt7925_mcu_uni_tx_ba(dev, params, false);
> >  		break;
> >  	case IEEE80211_AMPDU_TX_START:
> >  		set_bit(tid, &msta->deflink.wcid.ampdu_state);
> > @@ -1271,8 +1315,9 @@ mt7925_ampdu_action(struct ieee80211_hw *hw, struct ieee80211_vif *vif,
> >  	case IEEE80211_AMPDU_TX_STOP_CONT:
> >  		mtxq->aggr = false;
> >  		clear_bit(tid, &msta->deflink.wcid.ampdu_state);
> > -		mt7925_mcu_uni_tx_ba(dev, params, false);
> > -		ieee80211_stop_tx_ba_cb_irqsafe(vif, sta->addr, tid);
> > +		ret = mt7925_mcu_uni_tx_ba(dev, params, false);
> > +		if (!ret)
> > +			ieee80211_stop_tx_ba_cb_irqsafe(vif, sta->addr, tid);
> >  		break;
> >  	}
> >  	mt792x_mutex_release(dev);
> > @@ -1293,12 +1338,12 @@ mt7925_mlo_pm_iter(void *priv, u8 *mac, struct ieee80211_vif *vif)
> >  	if (mvif->mlo_pm_state != MT792x_MLO_CHANGED_PS)
> >  		return;
> >
> > -	mt792x_mutex_acquire(dev);
> >  	for_each_set_bit(i, &valid, IEEE80211_MLD_MAX_NUM_LINKS) {
> >  		bss_conf = mt792x_vif_to_bss_conf(vif, i);
> > +		if (!bss_conf)
> > +			continue;
> >  		mt7925_mcu_uni_bss_ps(dev, bss_conf);
> >  	}
> > -	mt792x_mutex_release(dev);
> >  }
> >
> >  void mt7925_mlo_pm_work(struct work_struct *work)
> > @@ -1307,9 +1352,11 @@ void mt7925_mlo_pm_work(struct work_struct *work)
> >  					      mlo_pm_work.work);
> >  	struct ieee80211_hw *hw = mt76_hw(dev);
> >
> > +	mt792x_mutex_acquire(dev);
> >  	ieee80211_iterate_active_interfaces(hw,
> >  					    IEEE80211_IFACE_ITER_RESUME_ALL,
> >  					    mt7925_mlo_pm_iter, dev);
> > +	mt792x_mutex_release(dev);
> >  }
> >
> >  static bool is_valid_alpha2(const char *alpha2)
> > @@ -1645,6 +1692,8 @@ static void mt7925_ipv6_addr_change(struct ieee80211_hw *hw,
> >
> >  	for_each_set_bit(i, &valid, IEEE80211_MLD_MAX_NUM_LINKS) {
> >  		bss_conf = mt792x_vif_to_bss_conf(vif, i);
> > +		if (!bss_conf)
> > +			continue;
> >  		__mt7925_ipv6_addr_change(hw, bss_conf, idev);
> >  	}
> >  }
> > @@ -1706,6 +1755,9 @@ mt7925_conf_tx(struct ieee80211_hw *hw, struct ieee80211_vif *vif,
> >  		[IEEE80211_AC_BK] = 1,
> >  	};
> >
> > +	if (!mconf)
> > +		return -EINVAL;
> > +
> >  	/* firmware uses access class index */
> >  	mconf->queue_params[mq_to_aci[queue]] = *params;
> >
> > @@ -1876,6 +1928,8 @@ static void mt7925_vif_cfg_changed(struct ieee80211_hw *hw,
> >  	if (changed & BSS_CHANGED_ARP_FILTER) {
> >  		for_each_set_bit(i, &valid, IEEE80211_MLD_MAX_NUM_LINKS) {
> >  			bss_conf = mt792x_vif_to_bss_conf(vif, i);
> > +			if (!bss_conf)
> > +				continue;
> >  			mt7925_mcu_update_arp_filter(&dev->mt76, bss_conf);
> >  		}
> >  	}
> > @@ -1891,6 +1945,8 @@ static void mt7925_vif_cfg_changed(struct ieee80211_hw *hw,
> >  		} else if (mvif->mlo_pm_state == MT792x_MLO_CHANGED_PS) {
> >  			for_each_set_bit(i, &valid, IEEE80211_MLD_MAX_NUM_LINKS) {
> >  				bss_conf = mt792x_vif_to_bss_conf(vif, i);
> > +				if (!bss_conf)
> > +					continue;
> >  				mt7925_mcu_uni_bss_ps(dev, bss_conf);
> >  			}
> >  		}
> > @@ -1912,7 +1968,12 @@ static void mt7925_link_info_changed(struct ieee80211_hw *hw,
> >  	struct ieee80211_bss_conf *link_conf;
> >
> >  	mconf = mt792x_vif_to_link(mvif, info->link_id);
> > +	if (!mconf)
> > +		return;
> > +
> >  	link_conf = mt792x_vif_to_bss_conf(vif, mconf->link_id);
> > +	if (!link_conf)
> > +		return;
> >
> >  	mt792x_mutex_acquire(dev);
> >
> > @@ -2033,6 +2094,11 @@ mt7925_change_vif_links(struct ieee80211_hw *hw, struct ieee80211_vif *vif,
> >  		mlink = mlinks[link_id];
> >  		link_conf = mt792x_vif_to_bss_conf(vif, link_id);
> >
> > +		if (!link_conf) {
> > +			err = -EINVAL;
> > +			goto free;
> > +		}
> > +
> >  		rcu_assign_pointer(mvif->link_conf[link_id], mconf);
> >  		rcu_assign_pointer(mvif->sta.link[link_id], mlink);
> >
> > @@ -2113,9 +2179,14 @@ static int mt7925_assign_vif_chanctx(struct ieee80211_hw *hw,
> >
> >  	if (ieee80211_vif_is_mld(vif)) {
> >  		mconf = mt792x_vif_to_link(mvif, link_conf->link_id);
> > +		if (!mconf) {
> > +			mutex_unlock(&dev->mt76.mutex);
> > +			return -EINVAL;
> > +		}
> > +
> >  		pri_link_conf = mt792x_vif_to_bss_conf(vif, mvif->deflink_id);
> >
> > -		if (vif->type == NL80211_IFTYPE_STATION &&
> > +		if (pri_link_conf && vif->type == NL80211_IFTYPE_STATION &&
> >  		    mconf == &mvif->bss_conf)
> >  			mt7925_mcu_add_bss_info(&dev->phy, NULL, pri_link_conf,
> >  						NULL, true);
> > @@ -2144,6 +2215,10 @@ static void mt7925_unassign_vif_chanctx(struct ieee80211_hw *hw,
> >
> >  	if (ieee80211_vif_is_mld(vif)) {
> >  		mconf = mt792x_vif_to_link(mvif, link_conf->link_id);
> > +		if (!mconf) {
> > +			mutex_unlock(&dev->mt76.mutex);
> > +			return;
> > +		}
> >
> >  		if (vif->type == NL80211_IFTYPE_STATION &&
> >  		    mconf == &mvif->bss_conf)
> > diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/mcu.c b/drivers/net/wireless/mediatek/mt76/mt7925/mcu.c
> > index 8eda407e4135..cf38e36790e7 100644
> > --- a/drivers/net/wireless/mediatek/mt76/mt7925/mcu.c
> > +++ b/drivers/net/wireless/mediatek/mt76/mt7925/mcu.c
> > @@ -1722,6 +1722,10 @@ mt7925_mcu_sta_phy_tlv(struct sk_buff *skb,
> >
> >  	link_conf = mt792x_vif_to_bss_conf(vif, link_sta->link_id);
> >  	mconf = mt792x_vif_to_link(mvif, link_sta->link_id);
> > +
> > +	if (!link_conf || !mconf)
> > +		return;
> > +
> >  	chandef = mconf->mt76.ctx ? &mconf->mt76.ctx->def :
> >  				    &link_conf->chanreq.oper;
> >
> > @@ -1800,6 +1804,10 @@ mt7925_mcu_sta_rate_ctrl_tlv(struct sk_buff *skb,
> >
> >  	link_conf = mt792x_vif_to_bss_conf(vif, link_sta->link_id);
> >  	mconf = mt792x_vif_to_link(mvif, link_sta->link_id);
> > +
> > +	if (!link_conf || !mconf)
> > +		return;
> > +
> >  	chandef = mconf->mt76.ctx ? &mconf->mt76.ctx->def :
> >  				    &link_conf->chanreq.oper;
> >  	band = chandef->chan->band;
> > diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/pci.c b/drivers/net/wireless/mediatek/mt76/mt7925/pci.c
> > index 8eb1fe1082d1..b6c90c5f7e91 100644
> > --- a/drivers/net/wireless/mediatek/mt76/mt7925/pci.c
> > +++ b/drivers/net/wireless/mediatek/mt76/mt7925/pci.c
> > @@ -454,7 +454,9 @@ static int mt7925_pci_suspend(struct device *device)
> >  	cancel_delayed_work_sync(&pm->ps_work);
> >  	cancel_work_sync(&pm->wake_work);
> >
> > +	mt792x_mutex_acquire(dev);
> >  	mt7925_roc_abort_sync(dev);
> > +	mt792x_mutex_release(dev);
> >
> >  	err = mt792x_mcu_drv_pmctrl(dev);
> >  	if (err < 0)
> > @@ -581,10 +583,12 @@ static int _mt7925_pci_resume(struct device *device, bool restore)
> >  	}
> >
> >  	/* restore previous ds setting */
> > +	mt792x_mutex_acquire(dev);
> >  	if (!pm->ds_enable)
> >  		mt7925_mcu_set_deep_sleep(dev, false);
> >
> >  	mt7925_regd_update(dev);
> > +	mt792x_mutex_release(dev);
> >  failed:
> >  	pm->suspended = false;
> >
> > diff --git a/drivers/net/wireless/mediatek/mt76/mt792x_core.c b/drivers/net/wireless/mediatek/mt76/mt792x_core.c
> > index 9cad572c34a3..0170a23b0529 100644
> > --- a/drivers/net/wireless/mediatek/mt76/mt792x_core.c
> > +++ b/drivers/net/wireless/mediatek/mt76/mt792x_core.c
> > @@ -95,6 +95,8 @@ void mt792x_tx(struct ieee80211_hw *hw, struct ieee80211_tx_control *control,
> >  				       IEEE80211_TX_CTRL_MLO_LINK);
> >  		sta = (struct mt792x_sta *)control->sta->drv_priv;
> >  		mlink = mt792x_sta_to_link(sta, link_id);
> > +		if (!mlink)
> > +			goto free_skb;
> >  		wcid = &mlink->wcid;
> >  	}
> >
> > @@ -113,9 +115,12 @@ void mt792x_tx(struct ieee80211_hw *hw, struct ieee80211_tx_control *control,
> >  		link_id = wcid->link_id;
> >  		rcu_read_lock();
> >  		conf = rcu_dereference(vif->link_conf[link_id]);
> > -		memcpy(hdr->addr2, conf->addr, ETH_ALEN);
> > -
> >  		link_sta = rcu_dereference(control->sta->link[link_id]);
> > +		if (!conf || !link_sta) {
> > +			rcu_read_unlock();
> > +			goto free_skb;
> > +		}
> > +		memcpy(hdr->addr2, conf->addr, ETH_ALEN);
> >  		memcpy(hdr->addr1, link_sta->addr, ETH_ALEN);
> >
> >  		if (vif->type == NL80211_IFTYPE_STATION)
> > @@ -136,6 +141,10 @@ void mt792x_tx(struct ieee80211_hw *hw, struct ieee80211_tx_control *control,
> >  	}
> >
> >  	mt76_connac_pm_queue_skb(hw, &dev->pm, wcid, skb);
> > +	return;
> > +
> > +free_skb:
> > +	ieee80211_free_txskb(hw, skb);
> >  }
> >  EXPORT_SYMBOL_GPL(mt792x_tx);
> >
> >
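The mt792x_tx() hunk quoted above moves the first memcpy() below both lookups, so the header is only rewritten once conf and link_sta are known to be non-NULL. The same ordering can be tried in user space with the minimal sketch below. It is not driver code: toy_hdr, toy_fill_addrs() and ADDR_LEN are hypothetical names invented for this illustration, and plain pointers stand in for the RCU-protected lookups.

```c
#include <stddef.h>
#include <string.h>

#define ADDR_LEN 6	/* same value as ETH_ALEN; redefined so the sketch builds alone */

/* Hypothetical, trimmed-down stand-in for the 802.11 header fields involved. */
struct toy_hdr {
	unsigned char addr1[ADDR_LEN];
	unsigned char addr2[ADDR_LEN];
};

/*
 * Validate both lookups before touching the header, so a frame is either
 * fully rewritten or left untouched. Returns 0 on success and -1 when the
 * caller should drop the frame (the driver's free_skb path).
 */
static int toy_fill_addrs(struct toy_hdr *hdr,
			  const unsigned char *conf_addr,
			  const unsigned char *sta_addr)
{
	if (!conf_addr || !sta_addr)
		return -1;

	memcpy(hdr->addr2, conf_addr, ADDR_LEN);
	memcpy(hdr->addr1, sta_addr, ADDR_LEN);
	return 0;
}
```

Checking both pointers first also means there is only one failure exit, which is what lets the real patch use a single free_skb label.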
* [PATCH v3 00/17] wifi: mt76: mt7925/mt792x: comprehensive stability fixes
  2026-01-02 20:05 ` [PATCH] wifi: mt76: mt7925: comprehensive stability fixes Zac Bowling
  2026-01-03 6:25   ` Sean Wang
@ 2026-01-05 0:26   ` Zac Bowling
  2026-01-05 0:26     ` [PATCH 01/17] wifi: mt76: mt7925: fix NULL pointer dereference in vif iteration Zac Bowling
                      ` (17 more replies)
  1 sibling, 18 replies; 113+ messages in thread
From: Zac Bowling @ 2026-01-05 0:26 UTC
  To: zbowling
  Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless,
	lorenzo, nbd, ryder.lee, sean.wang

From: Zac Bowling <zac@zacbowling.com>

This patch series addresses kernel panics, system deadlocks, and various
stability issues in the MT7925 WiFi driver. The issues were discovered on
kernel 6.17 (Ubuntu 25.10) and fixes were developed and tested on 6.18.2.

These patches are based on the wireless tree (nbd168/wireless.git) as
requested by Sean Wang.

== Problem Description ==

The MT7925 driver has several bugs that cause:
- Kernel NULL pointer dereferences during BSSID roaming
- System-wide deadlocks requiring hard reboot
- Firmware reload failures after suspend/resume
- Key removal errors during MLO roaming

These issues manifest approximately every 5 minutes when the adapter
tries to switch to a better BSSID, particularly in enterprise environments
with multiple access points.

== Root Causes ==

1. Missing mutex protection around ieee80211_iterate_active_interfaces()
   when the callback invokes MCU functions (patches 2, 3, 16)

2. NULL pointer dereferences where mt792x_vif_to_bss_conf(),
   mt792x_sta_to_link(), and similar functions return NULL during MLO
   state transitions but results are not checked
   (patches 1, 4, 5, 9, 10, 14, 17)

3. Ignored MCU return values hiding firmware errors (patches 6, 7, 8)

4. WARN_ON_ONCE used where NULL is expected during normal MLO AP setup
   (patch 13)

5. Firmware semaphore not released after failed load attempts (patch 15)

6. Key removal returning error when link is already torn down (patch 12)

== Testing ==

Stress tested by hammering the driver with a custom test script.

Tested on:
- Framework Desktop (AMD Ryzen AI Max 300 Series) with MT7925 (RZ717)
- This whole patch series was tested on kernel 6.18.2 and 6.17.12
  (Ubuntu 25.10)
- Enterprise WiFi environment with multiple Wi-Fi 7 APs with MLO enabled

Before patches: System hangs/panics every 5-15 minutes during BSSID roaming
After patches: Stable for 24+ hours under continuous stress testing

== Crash Traces Fixed ==

Primary NULL pointer dereference:
  BUG: kernel NULL pointer dereference, address: 0000000000000010
  Workqueue: mt76 mt7925_mac_reset_work [mt7925_common]
  RIP: 0010:mt76_connac_mcu_uni_add_dev+0x9c/0x780 [mt76_connac_lib]
  Call Trace:
   mt7925_vif_connect_iter+0xcb/0x240 [mt7925_common]
   __iterate_interfaces+0x92/0x130 [mac80211]
   ieee80211_iterate_interfaces+0x3d/0x60 [mac80211]
   mt7925_mac_reset_work+0x105/0x190 [mt7925_common]

Deadlock trace:
  INFO: task kworker/u128:0:48737 blocked for more than 122 seconds.
  Workqueue: mt76 mt7925_mac_reset_work [mt7925_common]
  Call Trace:
   __mutex_lock.constprop.0+0x3d0/0x6d0
   mt7925_mac_reset_work+0x85/0x170 [mt7925_common]

== Related Links ==

Framework Community discussion:
https://community.frame.work/t/kernel-panic-from-wifi-mediatek-mt7925-nullptr-dereference/79301

OpenWrt GitHub issues:
https://github.com/openwrt/mt76/issues/1014
https://github.com/openwrt/mt76/issues/1036

GitHub repository with additional analysis:
https://github.com/zbowling/mt7925

Zac Bowling (17):
  wifi: mt76: mt7925: fix NULL pointer dereference in vif iteration
  wifi: mt76: mt7925: fix missing mutex protection in reset and ROC abort
  wifi: mt76: mt7925: fix missing mutex protection in runtime PM and MLO PM
  wifi: mt76: mt7925: add NULL checks in MCU STA TLV functions
  wifi: mt76: mt7925: add NULL checks for link_conf and mlink in main.c
  wifi: mt76: mt7925: add error handling for AMPDU MCU commands
  wifi: mt76: mt7925: add error handling for BSS info MCU command in sta_add
  wifi: mt76: mt7925: add error handling for BSS info in key setup
  wifi: mt76: mt7925: add NULL checks in MLO link and chanctx functions
  wifi: mt76: mt792x: fix NULL pointer dereference in TX path
  wifi: mt76: mt7925: add lockdep assertions for mutex verification
  wifi: mt76: mt7925: fix key removal failure during MLO roaming
  wifi: mt76: mt7925: fix kernel warning in MLO ROC setup
  wifi: mt76: mt7925: add NULL checks for MLO link pointers in MCU functions
  wifi: mt76: mt792x: fix firmware reload failure after previous load crash
  wifi: mt76: mt7925: add mutex protection in resume path
  wifi: mt76: mt7925: add NULL checks in link station and TX queue setup

 drivers/net/wireless/mediatek/mt76/mt792x_core.c | 27 +++++++++++++++-
 drivers/net/wireless/mediatek/mt76/mt7925/mac.c  |  8 +++++
 drivers/net/wireless/mediatek/mt76/mt7925/main.c | 95 +++++++++++++++++++++---
 drivers/net/wireless/mediatek/mt76/mt7925/mcu.c  | 52 ++++++++++++++---
 drivers/net/wireless/mediatek/mt76/mt7925/pci.c  |  6 +++
 5 files changed, 170 insertions(+), 18 deletions(-)

--
2.51.0
* [PATCH 01/17] wifi: mt76: mt7925: fix NULL pointer dereference in vif iteration
  2026-01-05 0:26 ` [PATCH v3 00/17] wifi: mt76: mt7925/mt792x: " Zac Bowling
@ 2026-01-05 0:26   ` Zac Bowling
  2026-01-05 0:26   ` [PATCH 02/17] wifi: mt76: mt7925: fix missing mutex protection in reset and ROC abort Zac Bowling
                    ` (16 subsequent siblings)
  17 siblings, 0 replies; 113+ messages in thread
From: Zac Bowling @ 2026-01-05 0:26 UTC
  To: zbowling
  Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless,
	lorenzo, nbd, ryder.lee, sean.wang

mt792x_vif_to_bss_conf() can return NULL when iterating over valid_links
during HW reset or other state transitions, because the link configuration
in mac80211 may not be set up yet even though the driver's valid_links
bitmap has the link marked as valid.

This causes a NULL pointer dereference in mt76_connac_mcu_uni_add_dev()
when it tries to access bss_conf->vif->type, and similar crashes in other
functions that use bss_conf without checking.

This crash was observed on Framework Desktop (AMD Ryzen AI Max 300) with
MT7925 (RZ717) running kernel 6.17. The panic occurs during BSSID roaming
when the adapter attempts to switch to a better access point:

  BUG: kernel NULL pointer dereference, address: 0000000000000010
  CPU: 1 UID: 0 PID: 8362 Comm: kworker/u128:10 Tainted: G OE
  Workqueue: mt76 mt7925_mac_reset_work [mt7925_common]
  RIP: 0010:mt76_connac_mcu_uni_add_dev+0x9c/0x780 [mt76_connac_lib]
  Call Trace:
   mt7925_vif_connect_iter+0xcb/0x240 [mt7925_common]
   __iterate_interfaces+0x92/0x130 [mac80211]
   ieee80211_iterate_interfaces+0x3d/0x60 [mac80211]
   mt7925_mac_reset_work+0x105/0x190 [mt7925_common]
   process_one_work+0x18b/0x370
   worker_thread+0x317/0x450

The issue manifests approximately every 5 minutes when the adapter tries
to hop to a better BSSID, causing system-wide hangs where network commands
(ip, ifconfig, etc.) hang indefinitely.

Add NULL checks for bss_conf before using it in:
- mt7925_vif_connect_iter()
- mt7925_change_vif_links()
- mt7925_mac_sta_assoc()
- mt7925_mac_sta_remove_links()

Fixes: c948b5da6bbe ("wifi: mt76: mt7925: add Mediatek Wi-Fi7 driver for mt7925 chips")
Link: https://community.frame.work/t/kernel-panic-from-wifi-mediatek-mt7925-nullptr-dereference/79301
Reported-by: Zac Bowling <zac@zacbowling.com>
Tested-by: Zac Bowling <zac@zacbowling.com>
Signed-off-by: Zac Bowling <zac@zacbowling.com>
---
 drivers/net/wireless/mediatek/mt76/mt7925/mac.c  | 6 ++++++
 drivers/net/wireless/mediatek/mt76/mt7925/main.c | 8 ++++++++
 2 files changed, 14 insertions(+)

diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/mac.c b/drivers/net/wireless/mediatek/mt76/mt7925/mac.c
index 871b67101976..184efe8afa10 100644
--- a/drivers/net/wireless/mediatek/mt76/mt7925/mac.c
+++ b/drivers/net/wireless/mediatek/mt76/mt7925/mac.c
@@ -1271,6 +1271,12 @@ mt7925_vif_connect_iter(void *priv, u8 *mac,
 		bss_conf = mt792x_vif_to_bss_conf(vif, i);
 		mconf = mt792x_vif_to_link(mvif, i);

+		/* Skip links that don't have bss_conf set up yet in mac80211.
+		 * This can happen during HW reset when link state is inconsistent.
+		 */
+		if (!bss_conf)
+			continue;
+
 		mt76_connac_mcu_uni_add_dev(&dev->mphy, bss_conf, &mconf->mt76,
 					    &mvif->sta.deflink.wcid, true);
 		mt7925_mcu_set_tx(dev, bss_conf);
diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/main.c b/drivers/net/wireless/mediatek/mt76/mt7925/main.c
index 2d358a96640c..3001a62a8b67 100644
--- a/drivers/net/wireless/mediatek/mt76/mt7925/main.c
+++ b/drivers/net/wireless/mediatek/mt76/mt7925/main.c
@@ -1304,6 +1304,8 @@ mt7925_mlo_pm_iter(void *priv, u8 *mac, struct ieee80211_vif *vif)
 	mt792x_mutex_acquire(dev);
 	for_each_set_bit(i, &valid, IEEE80211_MLD_MAX_NUM_LINKS) {
 		bss_conf = mt792x_vif_to_bss_conf(vif, i);
+		if (!bss_conf)
+			continue;
 		mt7925_mcu_uni_bss_ps(dev, bss_conf);
 	}
 	mt792x_mutex_release(dev);
@@ -1630,6 +1632,8 @@ static void mt7925_ipv6_addr_change(struct ieee80211_hw *hw,

 	for_each_set_bit(i, &valid, IEEE80211_MLD_MAX_NUM_LINKS) {
 		bss_conf = mt792x_vif_to_bss_conf(vif, i);
+		if (!bss_conf)
+			continue;
 		__mt7925_ipv6_addr_change(hw, bss_conf, idev);
 	}
 }
@@ -1861,6 +1865,8 @@ static void mt7925_vif_cfg_changed(struct ieee80211_hw *hw,
 	if (changed & BSS_CHANGED_ARP_FILTER) {
 		for_each_set_bit(i, &valid, IEEE80211_MLD_MAX_NUM_LINKS) {
 			bss_conf = mt792x_vif_to_bss_conf(vif, i);
+			if (!bss_conf)
+				continue;
 			mt7925_mcu_update_arp_filter(&dev->mt76, bss_conf);
 		}
 	}
@@ -1876,6 +1882,8 @@ static void mt7925_vif_cfg_changed(struct ieee80211_hw *hw,
 		} else if (mvif->mlo_pm_state == MT792x_MLO_CHANGED_PS) {
 			for_each_set_bit(i, &valid, IEEE80211_MLD_MAX_NUM_LINKS) {
 				bss_conf = mt792x_vif_to_bss_conf(vif, i);
+				if (!bss_conf)
+					continue;
 				mt7925_mcu_uni_bss_ps(dev, bss_conf);
 			}
 		}
--
2.51.0
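The guard this patch repeats in every valid_links loop can be reproduced outside the kernel. The following is a minimal user-space sketch, not driver code: struct link_conf, lookup_link(), configure_valid_links() and MAX_LINKS are hypothetical stand-ins for struct ieee80211_bss_conf, mt792x_vif_to_bss_conf(), the loop body and IEEE80211_MLD_MAX_NUM_LINKS, and a plain array replaces the RCU-protected link table.

```c
#include <stddef.h>

#define MAX_LINKS 15	/* stand-in for IEEE80211_MLD_MAX_NUM_LINKS */

/* Hypothetical, trimmed-down link configuration. */
struct link_conf {
	int link_id;
};

/* Hypothetical lookup: returns NULL while the slot is not populated yet. */
static struct link_conf *lookup_link(struct link_conf *table[MAX_LINKS],
				     unsigned int id)
{
	return id < MAX_LINKS ? table[id] : NULL;
}

/*
 * Walk every bit set in valid_links and skip links whose configuration is
 * missing, instead of dereferencing a NULL pointer. Returns the number of
 * links that were actually processed.
 */
static int configure_valid_links(unsigned long valid_links,
				 struct link_conf *table[MAX_LINKS])
{
	int configured = 0;

	for (unsigned int i = 0; i < MAX_LINKS; i++) {
		struct link_conf *conf;

		if (!(valid_links & (1UL << i)))
			continue;

		conf = lookup_link(table, i);
		if (!conf)
			continue;	/* bitmap says valid, table is not ready */

		configured++;	/* the driver would send an MCU command here */
	}

	return configured;
}
```

With a bitmap of 0x3 and only slot 0 populated, the loop processes one link and silently skips the other, which is the behaviour the patch gives the driver during a reset.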
* [PATCH 02/17] wifi: mt76: mt7925: fix missing mutex protection in reset and ROC abort
  2026-01-05 0:26 ` [PATCH v3 00/17] wifi: mt76: mt7925/mt792x: " Zac Bowling
  2026-01-05 0:26   ` [PATCH 01/17] wifi: mt76: mt7925: fix NULL pointer dereference in vif iteration Zac Bowling
@ 2026-01-05 0:26   ` Zac Bowling
  2026-01-05 0:26   ` [PATCH 03/17] wifi: mt76: mt7925: fix missing mutex protection in runtime PM and MLO PM Zac Bowling
                    ` (15 subsequent siblings)
  17 siblings, 0 replies; 113+ messages in thread
From: Zac Bowling @ 2026-01-05 0:26 UTC
  To: zbowling
  Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless,
	lorenzo, nbd, ryder.lee, sean.wang

During firmware recovery and ROC (Remain On Channel) abort operations,
the driver iterates over active interfaces and calls MCU functions that
require the device mutex to be held, but the mutex was not acquired.

This causes system-wide deadlocks where the system becomes completely
unresponsive. From logs on affected systems:

  INFO: task kworker/u128:0:48737 blocked for more than 122 seconds.
  Workqueue: mt76 mt7925_mac_reset_work [mt7925_common]
  Call Trace:
   __schedule+0x426/0x12c0
   schedule+0x27/0xf0
   schedule_preempt_disabled+0x15/0x30
   __mutex_lock.constprop.0+0x3d0/0x6d0
   mt7925_mac_reset_work+0x85/0x170 [mt7925_common]

The deadlock manifests approximately every 5 minutes when the adapter
tries to hop to a better BSSID, triggering firmware reset. Network
commands (ip, ifconfig, etc.) hang indefinitely, processes get stuck in
uninterruptible sleep (D state), and reboot hangs as well.

Add mutex protection around interface iteration in:
- mt7925_mac_reset_work(): Called during firmware recovery after MCU
  timeouts to reconnect all interfaces
- mt7925_roc_abort_sync() in suspend path: Called during suspend to clean
  up Remain On Channel operations

This matches the pattern used in mt7615 and other MediaTek drivers where
interface iteration callbacks invoke MCU functions with mutex held:

  // mt7615/main.c - roc_work has mutex protection
  mt7615_mutex_acquire(phy->dev);
  ieee80211_iterate_active_interfaces(...);
  mt7615_mutex_release(phy->dev);

Note: Sean Wang from MediaTek has submitted an alternative fix for the
ROC path using cancel_delayed_work() instead of cancel_delayed_work_sync().
Both approaches address the deadlock; this one adds explicit mutex
protection which may be superseded by the upstream fix.

Fixes: c948b5da6bbe ("wifi: mt76: mt7925: add Mediatek Wi-Fi7 driver for mt7925 chips")
Link: https://community.frame.work/t/kernel-panic-from-wifi-mediatek-mt7925-nullptr-dereference/79301
Reported-by: Zac Bowling <zac@zacbowling.com>
Tested-by: Zac Bowling <zac@zacbowling.com>
Signed-off-by: Zac Bowling <zac@zacbowling.com>
---
 drivers/net/wireless/mediatek/mt76/mt7925/mac.c | 2 ++
 drivers/net/wireless/mediatek/mt76/mt7925/pci.c | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/mac.c b/drivers/net/wireless/mediatek/mt76/mt7925/mac.c
index 184efe8afa10..06420ac6ed55 100644
--- a/drivers/net/wireless/mediatek/mt76/mt7925/mac.c
+++ b/drivers/net/wireless/mediatek/mt76/mt7925/mac.c
@@ -1331,9 +1331,11 @@ void mt7925_mac_reset_work(struct work_struct *work)
 	dev->hw_full_reset = false;
 	pm->suspended = false;
 	ieee80211_wake_queues(hw);
+	mt792x_mutex_acquire(dev);
 	ieee80211_iterate_active_interfaces(hw,
 					    IEEE80211_IFACE_ITER_RESUME_ALL,
 					    mt7925_vif_connect_iter, NULL);
+	mt792x_mutex_release(dev);
 	mt76_connac_power_save_sched(&dev->mt76.phy, pm);

 	mt7925_regd_change(&dev->phy, "00");
diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/pci.c b/drivers/net/wireless/mediatek/mt76/mt7925/pci.c
index c4161754c01d..e9d62c6aee91 100644
--- a/drivers/net/wireless/mediatek/mt76/mt7925/pci.c
+++ b/drivers/net/wireless/mediatek/mt76/mt7925/pci.c
@@ -455,7 +455,9 @@ static int mt7925_pci_suspend(struct device *device)
 	cancel_delayed_work_sync(&pm->ps_work);
 	cancel_work_sync(&pm->wake_work);

+	mt792x_mutex_acquire(dev);
 	mt7925_roc_abort_sync(dev);
+	mt792x_mutex_release(dev);

 	err = mt792x_mcu_drv_pmctrl(dev);
 	if (err < 0)
--
2.51.0
* [PATCH 03/17] wifi: mt76: mt7925: fix missing mutex protection in runtime PM and MLO PM
  2026-01-05 0:26 ` [PATCH v3 00/17] wifi: mt76: mt7925/mt792x: " Zac Bowling
  2026-01-05 0:26   ` [PATCH 01/17] wifi: mt76: mt7925: fix NULL pointer dereference in vif iteration Zac Bowling
  2026-01-05 0:26   ` [PATCH 02/17] wifi: mt76: mt7925: fix missing mutex protection in reset and ROC abort Zac Bowling
@ 2026-01-05 0:26   ` Zac Bowling
  2026-01-05 0:26   ` [PATCH 04/17] wifi: mt76: mt7925: add NULL checks in MCU STA TLV functions Zac Bowling
                    ` (14 subsequent siblings)
  17 siblings, 0 replies; 113+ messages in thread
From: Zac Bowling @ 2026-01-05 0:26 UTC
  To: zbowling
  Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless,
	lorenzo, nbd, ryder.lee, sean.wang

Two additional code paths iterate over active interfaces and call MCU
functions without proper mutex protection:

1. mt7925_set_runtime_pm(): Called when runtime PM settings change.
   The callback mt7925_pm_interface_iter() calls
   mt7925_mcu_set_beacon_filter() which in turn calls
   mt7925_mcu_set_rxfilter(). These MCU functions require the device
   mutex to be held.

2. mt7925_mlo_pm_work(): A workqueue function for MLO power management.
   The callback mt7925_mlo_pm_iter() was acquiring mutex internally,
   which is inconsistent with the rest of the driver where the caller
   holds the mutex during interface iteration.

These bugs can cause deadlocks when:
- Power management settings are changed while WiFi is active
- MLO power save state transitions occur during roaming

Move the mutex to the caller in mt7925_mlo_pm_work() for consistency
with the rest of the driver, and add mutex protection in
mt7925_set_runtime_pm().

Found through static analysis (clang-tidy) and comparison with the
MT7615 driver which correctly acquires mutex before interface iteration.

Fixes: c948b5da6bbe ("wifi: mt76: mt7925: add Mediatek Wi-Fi7 driver for mt7925 chips")
Reported-by: Zac Bowling <zac@zacbowling.com>
Tested-by: Zac Bowling <zac@zacbowling.com>
Signed-off-by: Zac Bowling <zac@zacbowling.com>
---
 drivers/net/wireless/mediatek/mt76/mt7925/main.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/main.c b/drivers/net/wireless/mediatek/mt76/mt7925/main.c
index 3001a62a8b67..9f17b21aef1c 100644
--- a/drivers/net/wireless/mediatek/mt76/mt7925/main.c
+++ b/drivers/net/wireless/mediatek/mt76/mt7925/main.c
@@ -751,9 +751,11 @@ void mt7925_set_runtime_pm(struct mt792x_dev *dev)
 	bool monitor = !!(hw->conf.flags & IEEE80211_CONF_MONITOR);

 	pm->enable = pm->enable_user && !monitor;
+	mt792x_mutex_acquire(dev);
 	ieee80211_iterate_active_interfaces(hw,
 					    IEEE80211_IFACE_ITER_RESUME_ALL,
 					    mt7925_pm_interface_iter, dev);
+	mt792x_mutex_release(dev);
 	pm->ds_enable = pm->ds_enable_user && !monitor;
 	mt7925_mcu_set_deep_sleep(dev, pm->ds_enable);
 }
@@ -1301,14 +1303,12 @@ mt7925_mlo_pm_iter(void *priv, u8 *mac, struct ieee80211_vif *vif)
 	if (mvif->mlo_pm_state != MT792x_MLO_CHANGED_PS)
 		return;

-	mt792x_mutex_acquire(dev);
 	for_each_set_bit(i, &valid, IEEE80211_MLD_MAX_NUM_LINKS) {
 		bss_conf = mt792x_vif_to_bss_conf(vif, i);
 		if (!bss_conf)
 			continue;
 		mt7925_mcu_uni_bss_ps(dev, bss_conf);
 	}
-	mt792x_mutex_release(dev);
 }

 void mt7925_mlo_pm_work(struct work_struct *work)
@@ -1317,9 +1317,11 @@ void mt7925_mlo_pm_work(struct work_struct *work)
 					      mlo_pm_work.work);
 	struct ieee80211_hw *hw = mt76_hw(dev);

+	mt792x_mutex_acquire(dev);
 	ieee80211_iterate_active_interfaces(hw,
 					    IEEE80211_IFACE_ITER_RESUME_ALL,
 					    mt7925_mlo_pm_iter, dev);
+	mt792x_mutex_release(dev);
 }

 void mt7925_scan_work(struct work_struct *work)
--
2.51.0
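The locking convention this patch settles on (the caller takes the lock once around the whole iteration, and the callback only relies on it being held) can be illustrated with a small user-space model. This is a sketch under stated assumptions, not the driver's implementation: struct toy_dev and every toy_* function are hypothetical, a boolean replaces the real mutex, and the check inside the callback plays the role that lockdep_assert_held() has in the kernel.

```c
#include <stdbool.h>

/* Hypothetical device state; 'locked' stands in for dev->mt76.mutex. */
struct toy_dev {
	bool locked;
	int mcu_cmds;		/* simulated MCU commands that were sent */
	int lock_violations;	/* callback invocations without the lock */
};

static void toy_lock(struct toy_dev *dev)   { dev->locked = true; }
static void toy_unlock(struct toy_dev *dev) { dev->locked = false; }

/* Per-interface callback: never locks by itself, only checks the lock. */
static void toy_iter(struct toy_dev *dev)
{
	if (!dev->locked) {
		dev->lock_violations++;	/* where a lockdep assertion would fire */
		return;
	}
	dev->mcu_cmds++;
}

/* Stand-in for ieee80211_iterate_active_interfaces(). */
static void toy_iterate(struct toy_dev *dev, int n_ifaces,
			void (*fn)(struct toy_dev *))
{
	for (int i = 0; i < n_ifaces; i++)
		fn(dev);
}

/* Work function: one lock/unlock pair around the whole iteration. */
static void toy_pm_work(struct toy_dev *dev, int n_ifaces)
{
	toy_lock(dev);
	toy_iterate(dev, n_ifaces, toy_iter);
	toy_unlock(dev);
}
```

Keeping the lock out of the callback means the same callback can be reused from any path that already holds the mutex without risking a recursive acquire.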
* [PATCH 04/17] wifi: mt76: mt7925: add NULL checks in MCU STA TLV functions
  2026-01-05 0:26 ` [PATCH v3 00/17] wifi: mt76: mt7925/mt792x: " Zac Bowling
                    ` (2 preceding siblings ...)
  2026-01-05 0:26   ` [PATCH 03/17] wifi: mt76: mt7925: fix missing mutex protection in runtime PM and MLO PM Zac Bowling
@ 2026-01-05 0:26   ` Zac Bowling
  2026-01-05 0:26   ` [PATCH 05/17] wifi: mt76: mt7925: add NULL checks for link_conf and mlink in main.c Zac Bowling
                    ` (13 subsequent siblings)
  17 siblings, 0 replies; 113+ messages in thread
From: Zac Bowling @ 2026-01-05 0:26 UTC
  To: zbowling
  Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless,
	lorenzo, nbd, ryder.lee, sean.wang

Add NULL pointer checks for link_conf and mconf in:
- mt7925_mcu_sta_phy_tlv(): builds PHY capability TLV for station record
- mt7925_mcu_sta_rate_ctrl_tlv(): builds rate control TLV for station record

Both functions call mt792x_vif_to_bss_conf() and mt792x_vif_to_link()
which can return NULL during MLO link state transitions when the link
configuration in mac80211 is not yet synchronized with the driver's
link tracking.

Without these checks, the driver will crash with a NULL pointer
dereference when accessing link_conf->chanreq.oper or
link_conf->basic_rates.

Found through static analysis (clang-tidy pattern matching for unchecked
return values from functions known to return NULL).

Reported-by: Zac Bowling <zac@zacbowling.com>
Fixes: c948b5da6bbe ("wifi: mt76: mt7925: add Mediatek Wi-Fi7 driver for mt7925 chips")
Signed-off-by: Zac Bowling <zac@zacbowling.com>
---
 drivers/net/wireless/mediatek/mt76/mt7925/mcu.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/mcu.c b/drivers/net/wireless/mediatek/mt76/mt7925/mcu.c
index cf0fdea45cf7..d61a7fbda745 100644
--- a/drivers/net/wireless/mediatek/mt76/mt7925/mcu.c
+++ b/drivers/net/wireless/mediatek/mt76/mt7925/mcu.c
@@ -1773,6 +1773,10 @@ mt7925_mcu_sta_phy_tlv(struct sk_buff *skb,

 	link_conf = mt792x_vif_to_bss_conf(vif, link_sta->link_id);
 	mconf = mt792x_vif_to_link(mvif, link_sta->link_id);
+
+	if (!link_conf || !mconf)
+		return;
+
 	chandef = mconf->mt76.ctx ? &mconf->mt76.ctx->def :
 				    &link_conf->chanreq.oper;

@@ -1851,6 +1855,10 @@ mt7925_mcu_sta_rate_ctrl_tlv(struct sk_buff *skb,

 	link_conf = mt792x_vif_to_bss_conf(vif, link_sta->link_id);
 	mconf = mt792x_vif_to_link(mvif, link_sta->link_id);
+
+	if (!link_conf || !mconf)
+		return;
+
 	chandef = mconf->mt76.ctx ? &mconf->mt76.ctx->def :
 				    &link_conf->chanreq.oper;
 	band = chandef->chan->band;
--
2.51.0
* [PATCH 05/17] wifi: mt76: mt7925: add NULL checks for link_conf and mlink in main.c 2026-01-05 0:26 ` [PATCH v3 00/17] wifi: mt76: mt7925/mt792x: " Zac Bowling ` (3 preceding siblings ...) 2026-01-05 0:26 ` [PATCH 04/17] wifi: mt76: mt7925: add NULL checks in MCU STA TLV functions Zac Bowling @ 2026-01-05 0:26 ` Zac Bowling 2026-01-05 0:26 ` [PATCH 06/17] wifi: mt76: mt7925: add error handling for AMPDU MCU commands Zac Bowling ` (12 subsequent siblings) 17 siblings, 0 replies; 113+ messages in thread From: Zac Bowling @ 2026-01-05 0:26 UTC (permalink / raw) To: zbowling Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless, lorenzo, nbd, ryder.lee, sean.wang Add NULL pointer checks throughout main.c for functions that call mt792x_vif_to_bss_conf(), mt792x_vif_to_link(), and mt792x_sta_to_link() without verifying the return value before dereferencing. Functions fixed: - mt7925_set_key(): Check link_conf, mconf, and mlink before use - mt7925_mac_link_sta_add(): Check link_conf before BSS info update - mt7925_mac_link_sta_assoc(): Check mlink and link_conf before use - mt7925_mac_link_sta_remove(): Check mlink and link_conf, add goto label for proper cleanup path - mt7925_change_vif_links(): Check link_conf before adding BSS These functions can receive NULL when the link configuration in mac80211 is not yet synchronized with the driver's link tracking during MLO operations or state transitions. Without these checks, the driver crashes during station add/remove/ association operations with NULL pointer dereference: BUG: kernel NULL pointer dereference, address: 0000000000000010 Call Trace: mt7925_mac_link_sta_add+0x... ... Found through static analysis and triggered during BSSID roaming on systems with multiple access points. 
Reported-by: Zac Bowling <zac@zacbowling.com> Fixes: c948b5da6bbe ("wifi: mt76: mt7925: add Mediatek Wi-Fi7 driver for mt7925 chips") Signed-off-by: Zac Bowling <zac@zacbowling.com> --- .../net/wireless/mediatek/mt76/mt7925/main.c | 27 ++++++++++++++++--- 1 file changed, 23 insertions(+), 4 deletions(-) diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/main.c b/drivers/net/wireless/mediatek/mt76/mt7925/main.c index 9f17b21aef1c..7d3322461bcf 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7925/main.c +++ b/drivers/net/wireless/mediatek/mt76/mt7925/main.c @@ -604,6 +604,10 @@ static int mt7925_set_link_key(struct ieee80211_hw *hw, enum set_key_cmd cmd, link_sta = sta ? mt792x_sta_to_link_sta(vif, sta, link_id) : NULL; mconf = mt792x_vif_to_link(mvif, link_id); mlink = mt792x_sta_to_link(msta, link_id); + + if (!link_conf || !mconf || !mlink) + return -EINVAL; + wcid = &mlink->wcid; wcid_keyidx = &wcid->hw_key_idx; @@ -889,6 +893,8 @@ static int mt7925_mac_link_sta_add(struct mt76_dev *mdev, MT_WTBL_UPDATE_ADM_COUNT_CLEAR); link_conf = mt792x_vif_to_bss_conf(vif, link_id); + if (!link_conf) + return -EINVAL; /* should update bss info before STA add */ if (vif->type == NL80211_IFTYPE_STATION && !link_sta->sta->tdls) { @@ -1034,6 +1040,8 @@ static void mt7925_mac_link_sta_assoc(struct mt76_dev *mdev, msta = (struct mt792x_sta *)link_sta->sta->drv_priv; mlink = mt792x_sta_to_link(msta, link_sta->link_id); + if (!mlink) + return; mt792x_mutex_acquire(dev); @@ -1043,12 +1051,13 @@ static void mt7925_mac_link_sta_assoc(struct mt76_dev *mdev, link_conf = mt792x_vif_to_bss_conf(vif, vif->bss_conf.link_id); } - if (vif->type == NL80211_IFTYPE_STATION && !link_sta->sta->tdls) { + if (link_conf && vif->type == NL80211_IFTYPE_STATION && !link_sta->sta->tdls) { struct mt792x_bss_conf *mconf; mconf = mt792x_link_conf_to_mconf(link_conf); - mt7925_mcu_add_bss_info(&dev->phy, mconf->mt76.ctx, - link_conf, link_sta, true); + if (mconf) + mt7925_mcu_add_bss_info(&dev->phy, 
mconf->mt76.ctx, + link_conf, link_sta, true); } ewma_avg_signal_init(&mlink->avg_ack_signal); @@ -1095,6 +1104,8 @@ static void mt7925_mac_link_sta_remove(struct mt76_dev *mdev, msta = (struct mt792x_sta *)link_sta->sta->drv_priv; mlink = mt792x_sta_to_link(msta, link_id); + if (!mlink) + return; mt7925_roc_abort_sync(dev); @@ -1108,10 +1119,12 @@ static void mt7925_mac_link_sta_remove(struct mt76_dev *mdev, link_conf = mt792x_vif_to_bss_conf(vif, link_id); - if (vif->type == NL80211_IFTYPE_STATION && !link_sta->sta->tdls) { + if (link_conf && vif->type == NL80211_IFTYPE_STATION && !link_sta->sta->tdls) { struct mt792x_bss_conf *mconf; mconf = mt792x_link_conf_to_mconf(link_conf); + if (!mconf) + goto out; if (ieee80211_vif_is_mld(vif)) mt792x_mac_link_bss_remove(dev, mconf, mlink); @@ -1119,6 +1132,7 @@ static void mt7925_mac_link_sta_remove(struct mt76_dev *mdev, mt7925_mcu_add_bss_info(&dev->phy, mconf->mt76.ctx, link_conf, link_sta, false); } +out: spin_lock_bh(&mdev->sta_poll_lock); if (!list_empty(&mlink->wcid.poll_list)) @@ -2031,6 +2045,11 @@ mt7925_change_vif_links(struct ieee80211_hw *hw, struct ieee80211_vif *vif, mlink = mlinks[link_id]; link_conf = mt792x_vif_to_bss_conf(vif, link_id); + if (!link_conf) { + err = -EINVAL; + goto free; + } + rcu_assign_pointer(mvif->link_conf[link_id], mconf); rcu_assign_pointer(mvif->sta.link[link_id], mlink); -- 2.51.0 ^ permalink raw reply related [flat|nested] 113+ messages in thread
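Note (illustrative sketch, not part of the patch): the guard pattern used in the hunks above can be modelled in plain userspace C. Everything named here (`struct link_table`, `lookup_link()`, `add_link_sta()`, `remove_link_sta()`) is a hypothetical stand-in invented for the example, not a driver API. Only the control flow is meant to mirror the diff: check the lookup result, fail or skip the dependent work, and still reach the common cleanup on the remove path.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MAX_LINKS 4

/* Hypothetical stand-in for a per-link configuration object. */
struct link_conf {
	int channel;
};

/* Hypothetical link table; a NULL slot models a link whose state is
 * not (or no longer) set up. */
struct link_table {
	struct link_conf *links[MAX_LINKS];
	int cleanup_runs;	/* counts how often common cleanup ran */
};

/* Lookup that may return NULL, in the spirit of the helpers in the patch. */
static struct link_conf *lookup_link(struct link_table *t, unsigned int id)
{
	if (id >= MAX_LINKS)
		return NULL;
	return t->links[id];
}

/* Add path: fail early with -EINVAL when the lookup yields nothing. */
static int add_link_sta(struct link_table *t, unsigned int id, int *channel)
{
	struct link_conf *conf = lookup_link(t, id);

	if (!conf)
		return -EINVAL;

	*channel = conf->channel;
	return 0;
}

/* Remove path: skip the per-link work on NULL, but always run cleanup. */
static void remove_link_sta(struct link_table *t, unsigned int id)
{
	struct link_conf *conf = lookup_link(t, id);

	if (!conf)
		goto out;

	conf->channel = 0;	/* per-link teardown that needs conf */
out:
	t->cleanup_runs++;	/* common cleanup, reached on both paths */
}
```

The `goto out` form is chosen over an early `return` on the remove path because the bookkeeping after the label has to happen whether or not the link was still present.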
* [PATCH 06/17] wifi: mt76: mt7925: add error handling for AMPDU MCU commands
  2026-01-05 0:26 ` [PATCH v3 00/17] wifi: mt76: mt7925/mt792x: " Zac Bowling
                  ` (4 preceding siblings ...)
  2026-01-05 0:26 ` [PATCH 05/17] wifi: mt76: mt7925: add NULL checks for link_conf and mlink in main.c Zac Bowling
@ 2026-01-05 0:26 ` Zac Bowling
  2026-01-05 0:26 ` [PATCH 07/17] wifi: mt76: mt7925: add error handling for BSS info MCU command in sta_add Zac Bowling
                  ` (11 subsequent siblings)
  17 siblings, 0 replies; 113+ messages in thread
From: Zac Bowling @ 2026-01-05 0:26 UTC (permalink / raw)
  To: zbowling
  Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless,
      lorenzo, nbd, ryder.lee, sean.wang

Check return values of mt7925_mcu_uni_rx_ba() and mt7925_mcu_uni_tx_ba()
in mt7925_ampdu_action() and propagate errors to the caller.

Previously, failures in these MCU commands were silently ignored, which
could leave block aggregation in an inconsistent state between the
driver and firmware.

For IEEE80211_AMPDU_TX_STOP_CONT, only call the completion callback
ieee80211_stop_tx_ba_cb_irqsafe() if the MCU command succeeded, to
avoid signaling completion when the firmware operation failed.

Found through code review - pattern of ignored return values throughout
AMPDU handling.
Reported-by: Zac Bowling <zac@zacbowling.com> Fixes: c948b5da6bbe ("wifi: mt76: mt7925: add Mediatek Wi-Fi7 driver for mt7925 chips") Signed-off-by: Zac Bowling <zac@zacbowling.com> --- drivers/net/wireless/mediatek/mt76/mt7925/main.c | 13 +++++++------ 1 file changed, 7 insertions(+), 6 deletions(-) diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/main.c b/drivers/net/wireless/mediatek/mt76/mt7925/main.c index 7d3322461bcf..d966e5ab50ff 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7925/main.c +++ b/drivers/net/wireless/mediatek/mt76/mt7925/main.c @@ -1271,22 +1271,22 @@ mt7925_ampdu_action(struct ieee80211_hw *hw, struct ieee80211_vif *vif, case IEEE80211_AMPDU_RX_START: mt76_rx_aggr_start(&dev->mt76, &msta->deflink.wcid, tid, ssn, params->buf_size); - mt7925_mcu_uni_rx_ba(dev, params, true); + ret = mt7925_mcu_uni_rx_ba(dev, params, true); break; case IEEE80211_AMPDU_RX_STOP: mt76_rx_aggr_stop(&dev->mt76, &msta->deflink.wcid, tid); - mt7925_mcu_uni_rx_ba(dev, params, false); + ret = mt7925_mcu_uni_rx_ba(dev, params, false); break; case IEEE80211_AMPDU_TX_OPERATIONAL: mtxq->aggr = true; mtxq->send_bar = false; - mt7925_mcu_uni_tx_ba(dev, params, true); + ret = mt7925_mcu_uni_tx_ba(dev, params, true); break; case IEEE80211_AMPDU_TX_STOP_FLUSH: case IEEE80211_AMPDU_TX_STOP_FLUSH_CONT: mtxq->aggr = false; clear_bit(tid, &msta->deflink.wcid.ampdu_state); - mt7925_mcu_uni_tx_ba(dev, params, false); + ret = mt7925_mcu_uni_tx_ba(dev, params, false); break; case IEEE80211_AMPDU_TX_START: set_bit(tid, &msta->deflink.wcid.ampdu_state); @@ -1295,8 +1295,9 @@ mt7925_ampdu_action(struct ieee80211_hw *hw, struct ieee80211_vif *vif, case IEEE80211_AMPDU_TX_STOP_CONT: mtxq->aggr = false; clear_bit(tid, &msta->deflink.wcid.ampdu_state); - mt7925_mcu_uni_tx_ba(dev, params, false); - ieee80211_stop_tx_ba_cb_irqsafe(vif, sta->addr, tid); + ret = mt7925_mcu_uni_tx_ba(dev, params, false); + if (!ret) + ieee80211_stop_tx_ba_cb_irqsafe(vif, sta->addr, tid); break; } 
mt792x_mutex_release(dev); -- 2.51.0 ^ permalink raw reply related [flat|nested] 113+ messages in thread
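Note (illustrative sketch, not part of the patch): the IEEE80211_AMPDU_TX_STOP_CONT change boils down to "propagate the command result, and signal completion only on success". The userspace model below assumes a hypothetical `struct fw` with an injectable failure; `fw_send_ba()` and `stop_tx_ba_done()` are stand-ins for the MCU command and the mac80211 completion callback, not real APIs.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of a firmware command channel. */
struct fw {
	bool fail_next;		/* when set, the next command fails once */
	int completions;	/* number of completion callbacks fired */
};

/* Stand-in for an MCU block-ack command; returns 0 or a negative errno. */
static int fw_send_ba(struct fw *fw, bool enable)
{
	(void)enable;
	if (fw->fail_next) {
		fw->fail_next = false;
		return -EIO;
	}
	return 0;
}

/* Stand-in for the "TX BA session stopped" completion callback. */
static void stop_tx_ba_done(struct fw *fw)
{
	fw->completions++;
}

/* Mirrors the TX_STOP_CONT case: complete only if the command succeeded,
 * and hand the result back to the caller either way. */
static int ampdu_tx_stop(struct fw *fw)
{
	int ret = fw_send_ba(fw, false);

	if (!ret)
		stop_tx_ba_done(fw);

	return ret;
}
```

With `fail_next` set, `ampdu_tx_stop()` returns the error and leaves `completions` untouched, which is the property the patch is after.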
* [PATCH 07/17] wifi: mt76: mt7925: add error handling for BSS info MCU command in sta_add
  2026-01-05 0:26 ` [PATCH v3 00/17] wifi: mt76: mt7925/mt792x: " Zac Bowling
                  ` (5 preceding siblings ...)
  2026-01-05 0:26 ` [PATCH 06/17] wifi: mt76: mt7925: add error handling for AMPDU MCU commands Zac Bowling
@ 2026-01-05 0:26 ` Zac Bowling
  2026-01-05 0:26 ` [PATCH 08/17] wifi: mt76: mt7925: add error handling for BSS info in key setup Zac Bowling
                  ` (10 subsequent siblings)
  17 siblings, 0 replies; 113+ messages in thread
From: Zac Bowling @ 2026-01-05 0:26 UTC (permalink / raw)
  To: zbowling
  Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless,
      lorenzo, nbd, ryder.lee, sean.wang

Check return value of mt7925_mcu_add_bss_info() in
mt7925_mac_link_sta_add() and propagate errors to the caller.

BSS info must be set up before adding a station record. If this MCU
command fails, continuing with station add would leave the firmware in
an inconsistent state with a station but no BSS configuration. This can
cause undefined behavior in the firmware, particularly during MLO link
setup where multiple BSS configurations are being programmed.
Reported-by: Zac Bowling <zac@zacbowling.com> Fixes: c948b5da6bbe ("wifi: mt76: mt7925: add Mediatek Wi-Fi7 driver for mt7925 chips") Signed-off-by: Zac Bowling <zac@zacbowling.com> --- drivers/net/wireless/mediatek/mt76/mt7925/main.c | 11 +++++++---- 1 file changed, 7 insertions(+), 4 deletions(-) diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/main.c b/drivers/net/wireless/mediatek/mt76/mt7925/main.c index d966e5ab50ff..a7e1e673c4bc 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7925/main.c +++ b/drivers/net/wireless/mediatek/mt76/mt7925/main.c @@ -899,11 +899,14 @@ static int mt7925_mac_link_sta_add(struct mt76_dev *mdev, /* should update bss info before STA add */ if (vif->type == NL80211_IFTYPE_STATION && !link_sta->sta->tdls) { if (ieee80211_vif_is_mld(vif)) - mt7925_mcu_add_bss_info(&dev->phy, mconf->mt76.ctx, - link_conf, link_sta, link_sta != mlink->pri_link); + ret = mt7925_mcu_add_bss_info(&dev->phy, mconf->mt76.ctx, + link_conf, link_sta, + link_sta != mlink->pri_link); else - mt7925_mcu_add_bss_info(&dev->phy, mconf->mt76.ctx, - link_conf, link_sta, false); + ret = mt7925_mcu_add_bss_info(&dev->phy, mconf->mt76.ctx, + link_conf, link_sta, false); + if (ret) + return ret; } if (ieee80211_vif_is_mld(vif) && -- 2.51.0 ^ permalink raw reply related [flat|nested] 113+ messages in thread
* [PATCH 08/17] wifi: mt76: mt7925: add error handling for BSS info in key setup
  2026-01-05 0:26 ` [PATCH v3 00/17] wifi: mt76: mt7925/mt792x: " Zac Bowling
                  ` (6 preceding siblings ...)
  2026-01-05 0:26 ` [PATCH 07/17] wifi: mt76: mt7925: add error handling for BSS info MCU command in sta_add Zac Bowling
@ 2026-01-05 0:26 ` Zac Bowling
  2026-01-05 0:26 ` [PATCH 09/17] wifi: mt76: mt7925: add NULL checks in MLO link and chanctx functions Zac Bowling
                  ` (9 subsequent siblings)
  17 siblings, 0 replies; 113+ messages in thread
From: Zac Bowling @ 2026-01-05 0:26 UTC (permalink / raw)
  To: zbowling
  Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless,
      lorenzo, nbd, ryder.lee, sean.wang

Check return value of mt7925_mcu_add_bss_info() in
mt7925_set_link_key() when setting up cipher for the first time and
propagate errors.

The BSS info update with cipher information must succeed before key
programming can proceed. If this MCU command fails, continuing with key
setup would program keys into the firmware for a BSS that does not have
the correct cipher configuration.

SECURITY NOTE: Silent failure here is particularly dangerous because
the user would believe encryption is active when the firmware may not
have the cipher properly configured, potentially resulting in
unencrypted or incorrectly encrypted traffic.

This ensures the error is propagated up the stack rather than silently
ignored.
Reported-by: Zac Bowling <zac@zacbowling.com> Fixes: c948b5da6bbe ("wifi: mt76: mt7925: add Mediatek Wi-Fi7 driver for mt7925 chips") Signed-off-by: Zac Bowling <zac@zacbowling.com> --- drivers/net/wireless/mediatek/mt76/mt7925/main.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/main.c b/drivers/net/wireless/mediatek/mt76/mt7925/main.c index a7e1e673c4bc..058394b2e067 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7925/main.c +++ b/drivers/net/wireless/mediatek/mt76/mt7925/main.c @@ -637,8 +637,10 @@ static int mt7925_set_link_key(struct ieee80211_hw *hw, enum set_key_cmd cmd, struct mt792x_phy *phy = mt792x_hw_phy(hw); mconf->mt76.cipher = mt7925_mcu_get_cipher(key->cipher); - mt7925_mcu_add_bss_info(phy, mconf->mt76.ctx, link_conf, - link_sta, true); + err = mt7925_mcu_add_bss_info(phy, mconf->mt76.ctx, link_conf, + link_sta, true); + if (err) + goto out; } if (cmd == SET_KEY) -- 2.51.0 ^ permalink raw reply related [flat|nested] 113+ messages in thread
* [PATCH 09/17] wifi: mt76: mt7925: add NULL checks in MLO link and chanctx functions
  2026-01-05 0:26 ` [PATCH v3 00/17] wifi: mt76: mt7925/mt792x: " Zac Bowling
                  ` (7 preceding siblings ...)
  2026-01-05 0:26 ` [PATCH 08/17] wifi: mt76: mt7925: add error handling for BSS info in key setup Zac Bowling
@ 2026-01-05 0:26 ` Zac Bowling
  2026-01-05 0:26 ` [PATCH 10/17] wifi: mt76: mt792x: fix NULL pointer dereference in TX path Zac Bowling
                  ` (8 subsequent siblings)
  17 siblings, 0 replies; 113+ messages in thread
From: Zac Bowling @ 2026-01-05 0:26 UTC (permalink / raw)
  To: zbowling
  Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless,
      lorenzo, nbd, ryder.lee, sean.wang

Add NULL pointer checks for mconf and link_conf in several functions
that were missing validation after calling mt792x_vif_to_link() and
mt792x_vif_to_bss_conf().

Functions fixed:
- mt7925_mac_set_links(): Check both primary and secondary link_conf
  before dereferencing chanreq.oper for band selection
- mt7925_link_info_changed(): Check mconf before using it to get
  link_conf, prevents NULL dereference chain
- mt7925_assign_vif_chanctx(): Check mconf before use, return -EINVAL
  if NULL; check pri_link_conf before passing to MCU function
- mt7925_unassign_vif_chanctx(): Check mconf before dereferencing,
  return early if NULL during MLO cleanup

These functions handle MLO (Multi-Link Operation) scenarios where link
configurations may not be fully set up when called, particularly during
rapid link state transitions or error recovery paths.

Prevents panics during WiFi 7 MLO link setup and teardown sequences.
Reported-by: Zac Bowling <zac@zacbowling.com> Fixes: c948b5da6bbe ("wifi: mt76: mt7925: add Mediatek Wi-Fi7 driver for mt7925 chips") Signed-off-by: Zac Bowling <zac@zacbowling.com> --- .../net/wireless/mediatek/mt76/mt7925/main.c | 39 +++++++++++++++---- 1 file changed, 32 insertions(+), 7 deletions(-) diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/main.c b/drivers/net/wireless/mediatek/mt76/mt7925/main.c index 058394b2e067..852cf8ff842f 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7925/main.c +++ b/drivers/net/wireless/mediatek/mt76/mt7925/main.c @@ -1006,18 +1006,29 @@ mt7925_mac_set_links(struct mt76_dev *mdev, struct ieee80211_vif *vif) { struct mt792x_dev *dev = container_of(mdev, struct mt792x_dev, mt76); struct mt792x_vif *mvif = (struct mt792x_vif *)vif->drv_priv; - struct ieee80211_bss_conf *link_conf = - mt792x_vif_to_bss_conf(vif, mvif->deflink_id); - struct cfg80211_chan_def *chandef = &link_conf->chanreq.oper; - enum nl80211_band band = chandef->chan->band, secondary_band; + struct ieee80211_bss_conf *link_conf; + struct cfg80211_chan_def *chandef; + enum nl80211_band band, secondary_band; + u16 sel_links; + u8 secondary_link_id; + + link_conf = mt792x_vif_to_bss_conf(vif, mvif->deflink_id); + if (!link_conf) + return; - u16 sel_links = mt76_select_links(vif, 2); - u8 secondary_link_id = __ffs(~BIT(mvif->deflink_id) & sel_links); + chandef = &link_conf->chanreq.oper; + band = chandef->chan->band; + + sel_links = mt76_select_links(vif, 2); + secondary_link_id = __ffs(~BIT(mvif->deflink_id) & sel_links); if (!ieee80211_vif_is_mld(vif) || hweight16(sel_links) < 2) return; link_conf = mt792x_vif_to_bss_conf(vif, secondary_link_id); + if (!link_conf) + return; + secondary_band = link_conf->chanreq.oper.chan->band; if (band == NL80211_BAND_2GHZ || @@ -1927,7 +1938,12 @@ static void mt7925_link_info_changed(struct ieee80211_hw *hw, struct ieee80211_bss_conf *link_conf; mconf = mt792x_vif_to_link(mvif, info->link_id); + if (!mconf) + return; + 
link_conf = mt792x_vif_to_bss_conf(vif, mconf->link_id); + if (!link_conf) + return; mt792x_mutex_acquire(dev); @@ -2136,9 +2152,14 @@ static int mt7925_assign_vif_chanctx(struct ieee80211_hw *hw, if (ieee80211_vif_is_mld(vif)) { mconf = mt792x_vif_to_link(mvif, link_conf->link_id); + if (!mconf) { + mutex_unlock(&dev->mt76.mutex); + return -EINVAL; + } + pri_link_conf = mt792x_vif_to_bss_conf(vif, mvif->deflink_id); - if (vif->type == NL80211_IFTYPE_STATION && + if (pri_link_conf && vif->type == NL80211_IFTYPE_STATION && mconf == &mvif->bss_conf) mt7925_mcu_add_bss_info(&dev->phy, NULL, pri_link_conf, NULL, true); @@ -2167,6 +2188,10 @@ static void mt7925_unassign_vif_chanctx(struct ieee80211_hw *hw, if (ieee80211_vif_is_mld(vif)) { mconf = mt792x_vif_to_link(mvif, link_conf->link_id); + if (!mconf) { + mutex_unlock(&dev->mt76.mutex); + return; + } if (vif->type == NL80211_IFTYPE_STATION && mconf == &mvif->bss_conf) -- 2.51.0 ^ permalink raw reply related [flat|nested] 113+ messages in thread
* [PATCH 10/17] wifi: mt76: mt792x: fix NULL pointer dereference in TX path
  2026-01-05 0:26 ` [PATCH v3 00/17] wifi: mt76: mt7925/mt792x: " Zac Bowling
                  ` (8 preceding siblings ...)
  2026-01-05 0:26 ` [PATCH 09/17] wifi: mt76: mt7925: add NULL checks in MLO link and chanctx functions Zac Bowling
@ 2026-01-05 0:26 ` Zac Bowling
  2026-01-05 0:26 ` [PATCH 11/17] wifi: mt76: mt7925: add lockdep assertions for mutex verification Zac Bowling
                  ` (7 subsequent siblings)
  17 siblings, 0 replies; 113+ messages in thread
From: Zac Bowling @ 2026-01-05 0:26 UTC (permalink / raw)
  To: zbowling
  Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless,
      lorenzo, nbd, ryder.lee, sean.wang

Add NULL pointer checks in mt792x_tx() to prevent kernel crashes when
transmitting packets during MLO link removal.

The function calls mt792x_sta_to_link() which can return NULL if the
link is being removed, but the return value was dereferenced without
checking. Similarly, the RCU-protected link_conf and link_sta pointers
were used without NULL validation.

This race can occur when:
1. A packet is queued for transmission
2. Concurrently, the link is being removed (mt7925_mac_link_sta_remove)
3. mt792x_sta_to_link() returns NULL for the removed link
4. Kernel crashes on wcid = &mlink->wcid dereference

Example crash trace:
  BUG: kernel NULL pointer dereference
  RIP: mt792x_tx+0x...
  Call Trace:
   ieee80211_tx+0x...
   __ieee80211_subif_start_xmit+0x...

Fix by:
- Check mlink return value before dereferencing wcid
- Check RCU-dereferenced conf and link_sta before use
- Free the SKB and return early if any pointer is NULL

This affects both MT7921 and MT7925 drivers as mt792x_core.c is shared.
Fixes: c74df1c067f2 ("wifi: mt76: mt792x: introduce mt792x-lib module") Reported-by: Zac Bowling <zac@zacbowling.com> Signed-off-by: Zac Bowling <zac@zacbowling.com> --- drivers/net/wireless/mediatek/mt76/mt792x_core.c | 13 +++++++++++-- 1 file changed, 11 insertions(+), 2 deletions(-) diff --git a/drivers/net/wireless/mediatek/mt76/mt792x_core.c b/drivers/net/wireless/mediatek/mt76/mt792x_core.c index f2ed16feb6c1..9dc768aa8b9c 100644 --- a/drivers/net/wireless/mediatek/mt76/mt792x_core.c +++ b/drivers/net/wireless/mediatek/mt76/mt792x_core.c @@ -95,6 +95,8 @@ void mt792x_tx(struct ieee80211_hw *hw, struct ieee80211_tx_control *control, IEEE80211_TX_CTRL_MLO_LINK); sta = (struct mt792x_sta *)control->sta->drv_priv; mlink = mt792x_sta_to_link(sta, link_id); + if (!mlink) + goto free_skb; wcid = &mlink->wcid; } @@ -113,9 +115,12 @@ void mt792x_tx(struct ieee80211_hw *hw, struct ieee80211_tx_control *control, link_id = wcid->link_id; rcu_read_lock(); conf = rcu_dereference(vif->link_conf[link_id]); - memcpy(hdr->addr2, conf->addr, ETH_ALEN); - link_sta = rcu_dereference(control->sta->link[link_id]); + if (!conf || !link_sta) { + rcu_read_unlock(); + goto free_skb; + } + memcpy(hdr->addr2, conf->addr, ETH_ALEN); memcpy(hdr->addr1, link_sta->addr, ETH_ALEN); if (vif->type == NL80211_IFTYPE_STATION) @@ -136,6 +141,10 @@ void mt792x_tx(struct ieee80211_hw *hw, struct ieee80211_tx_control *control, } mt76_connac_pm_queue_skb(hw, &dev->pm, wcid, skb); + return; + +free_skb: + ieee80211_free_txskb(hw, skb); } EXPORT_SYMBOL_GPL(mt792x_tx); -- 2.51.0 ^ permalink raw reply related [flat|nested] 113+ messages in thread
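Note (illustrative sketch, not part of the patch): the shape of the `free_skb` path is "return before the label on success, fall to the label only on failure". A tiny userspace model, with hypothetical `struct frame`, `struct link` and `tx_frame()` standing in for the SKB, the per-link station entry and mt792x_tx():

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical frame that records what happened to it. */
struct frame {
	bool queued;
	bool freed;
};

/* Hypothetical per-link entry; NULL models a link that was just removed. */
struct link {
	int wcid;
};

/* A frame is either queued or freed, never both and never neither. */
static void tx_frame(struct link *link, struct frame *f)
{
	if (!link)
		goto free_frame;

	f->queued = true;
	return;		/* success must not fall through into the free path */

free_frame:
	f->freed = true;
}
```

The explicit `return` before the label matters: without it, a successfully queued frame would also be freed.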
* [PATCH 11/17] wifi: mt76: mt7925: add lockdep assertions for mutex verification
  2026-01-05 0:26 ` [PATCH v3 00/17] wifi: mt76: mt7925/mt792x: " Zac Bowling
                  ` (9 preceding siblings ...)
  2026-01-05 0:26 ` [PATCH 10/17] wifi: mt76: mt792x: fix NULL pointer dereference in TX path Zac Bowling
@ 2026-01-05 0:26 ` Zac Bowling
  2026-01-05 0:26 ` [PATCH 12/17] wifi: mt76: mt7925: fix key removal failure during MLO roaming Zac Bowling
                  ` (6 subsequent siblings)
  17 siblings, 0 replies; 113+ messages in thread
From: Zac Bowling @ 2026-01-05 0:26 UTC (permalink / raw)
  To: zbowling
  Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless,
      lorenzo, nbd, ryder.lee, sean.wang

Add lockdep_assert_held() calls to critical MCU functions to help catch
mutex violations during development and debugging. This follows the
pattern used in other mt76 drivers (mt7996, mt7915, mt7615).

Functions with new assertions:
- mt7925_mcu_add_bss_info(): Core BSS configuration MCU command
- mt7925_mcu_sta_update(): Station record update MCU command
- mt7925_mcu_uni_bss_ps(): Power save state MCU command

These functions modify firmware state and must be called with the
device mutex held to prevent race conditions. The lockdep assertions
will trigger warnings at runtime if code paths exist that call these
functions without proper mutex protection.

This aids in detecting the class of bugs fixed by patches in this
series.
Signed-off-by: Zac Bowling <zac@zacbowling.com> --- drivers/net/wireless/mediatek/mt76/mt7925/mcu.c | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/mcu.c b/drivers/net/wireless/mediatek/mt76/mt7925/mcu.c index d61a7fbda745..958ff9da9f01 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7925/mcu.c +++ b/drivers/net/wireless/mediatek/mt76/mt7925/mcu.c @@ -1527,6 +1527,8 @@ int mt7925_mcu_uni_bss_ps(struct mt792x_dev *dev, }, }; + lockdep_assert_held(&dev->mt76.mutex); + if (link_conf->vif->type != NL80211_IFTYPE_STATION) return -EOPNOTSUPP; @@ -2037,6 +2039,8 @@ int mt7925_mcu_sta_update(struct mt792x_dev *dev, struct mt792x_sta *msta; struct mt792x_link_sta *mlink; + lockdep_assert_held(&dev->mt76.mutex); + if (link_sta) { msta = (struct mt792x_sta *)link_sta->sta->drv_priv; mlink = mt792x_sta_to_link(msta, link_sta->link_id); @@ -2843,6 +2847,8 @@ int mt7925_mcu_add_bss_info(struct mt792x_phy *phy, struct mt792x_link_sta *mlink_bc; struct sk_buff *skb; + lockdep_assert_held(&dev->mt76.mutex); + skb = __mt7925_mcu_alloc_bss_req(&dev->mt76, &mconf->mt76, MT7925_BSS_UPDATE_MAX_SIZE); if (IS_ERR(skb)) -- 2.51.0 ^ permalink raw reply related [flat|nested] 113+ messages in thread
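Note (illustrative sketch, not part of the patch): lockdep_assert_held() reports a caller that enters a function without the expected lock, and compiles away when lockdep is not configured. The single-threaded userspace model below only approximates that idea: `struct dev_lock`, `assert_lock_held()` and `mcu_update()` are hypothetical, and a plain flag replaces lockdep's per-task tracking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical lock that records whether it is held. A single flag is
 * enough for a one-thread sketch. */
struct dev_lock {
	bool held;
	int violations;	/* how many assertions fired */
};

static void dev_lock_acquire(struct dev_lock *l)
{
	l->held = true;
}

static void dev_lock_release(struct dev_lock *l)
{
	l->held = false;
}

/* Rough analogue of a held-lock assertion: report, but keep running. */
static void assert_lock_held(struct dev_lock *l, const char *caller)
{
	if (!l->held) {
		l->violations++;
		fprintf(stderr, "%s called without lock held\n", caller);
	}
}

/* Stand-in for an MCU helper that must run under the device mutex. */
static int mcu_update(struct dev_lock *l)
{
	assert_lock_held(l, __func__);
	return 0;
}
```

As in the patch, the assertion sits at the top of the callee, so every call path is checked without touching the callers.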
* [PATCH 12/17] wifi: mt76: mt7925: fix key removal failure during MLO roaming
  2026-01-05 0:26 ` [PATCH v3 00/17] wifi: mt76: mt7925/mt792x: " Zac Bowling
                  ` (10 preceding siblings ...)
  2026-01-05 0:26 ` [PATCH 11/17] wifi: mt76: mt7925: add lockdep assertions for mutex verification Zac Bowling
@ 2026-01-05 0:26 ` Zac Bowling
  2026-01-05 0:26 ` [PATCH 13/17] wifi: mt76: mt7925: fix kernel warning in MLO ROC setup Zac Bowling
                  ` (5 subsequent siblings)
  17 siblings, 0 replies; 113+ messages in thread
From: Zac Bowling @ 2026-01-05 0:26 UTC (permalink / raw)
  To: zbowling
  Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless,
      lorenzo, nbd, ryder.lee, sean.wang

During MLO roaming, mac80211 may request key removal after the link
state has already been torn down. The current code returns -EINVAL when
link_conf, mconf, or mlink is NULL, causing 'failed to remove key from
hardware (-22)' errors in the kernel log.

This is a race condition where:
1. MLO link teardown begins, cleaning up driver state
2. mac80211 requests group key removal for the old link
3. mt792x_vif_to_bss_conf() or related functions return NULL
4. Driver returns -EINVAL, confusing upper layers

Observed kernel log errors during roaming:
  wlp192s0: failed to remove key (1, ff:ff:ff:ff:ff:ff) from hardware (-22)
  wlp192s0: failed to remove key (4, ff:ff:ff:ff:ff:ff) from hardware (-22)

And associated wpa_supplicant warnings:
  nl80211: kernel reports: link ID must for MLO group key

The fix: When removing a key (cmd != SET_KEY), if the link state is
already gone, return success (0) instead of error. The key is
effectively removed when the link was torn down.
Fixes: c948b5da6bbe ("wifi: mt76: mt7925: add Mediatek Wi-Fi7 driver for mt7925 chips") Reported-by: Zac Bowling <zac@zacbowling.com> Tested-by: Zac Bowling <zac@zacbowling.com> Signed-off-by: Zac Bowling <zac@zacbowling.com> --- drivers/net/wireless/mediatek/mt76/mt7925/main.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/main.c b/drivers/net/wireless/mediatek/mt76/mt7925/main.c index 852cf8ff842f..7cf6faa1f6f4 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7925/main.c +++ b/drivers/net/wireless/mediatek/mt76/mt7925/main.c @@ -605,8 +605,15 @@ static int mt7925_set_link_key(struct ieee80211_hw *hw, enum set_key_cmd cmd, mconf = mt792x_vif_to_link(mvif, link_id); mlink = mt792x_sta_to_link(msta, link_id); - if (!link_conf || !mconf || !mlink) + if (!link_conf || !mconf || !mlink) { + /* During MLO roaming, link state may be torn down before + * mac80211 requests key removal. If removing a key and + * the link is already gone, consider it successfully removed. + */ + if (cmd != SET_KEY) + return 0; return -EINVAL; + } wcid = &mlink->wcid; wcid_keyidx = &wcid->hw_key_idx; -- 2.51.0 ^ permalink raw reply related [flat|nested] 113+ messages in thread
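Note (illustrative sketch, not part of the patch): the change makes key removal idempotent with respect to link teardown, while key installation on a missing link stays an error. In the userspace model below, `enum key_cmd`, `struct link_state` and `set_link_key()` are hypothetical names chosen for the example; a single NULL pointer stands in for the three lookups the driver performs.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Local mirror of the two commands; the values are arbitrary here. */
enum key_cmd { KEY_SET, KEY_DISABLE };

/* Hypothetical per-link state; key_idx == -1 means "no key installed". */
struct link_state {
	int key_idx;
};

static int set_link_key(struct link_state *link, enum key_cmd cmd, int idx)
{
	if (!link) {
		/* Link already torn down: there is nothing left to remove,
		 * so a removal request is treated as already satisfied. */
		if (cmd != KEY_SET)
			return 0;
		/* Installing a key on a missing link is still an error. */
		return -EINVAL;
	}

	link->key_idx = (cmd == KEY_SET) ? idx : -1;
	return 0;
}
```

The asymmetry is deliberate: reporting success for a removal describes the end state accurately, whereas reporting success for an installation that never happened would not.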
* [PATCH 13/17] wifi: mt76: mt7925: fix kernel warning in MLO ROC setup
  2026-01-05 0:26 ` [PATCH v3 00/17] wifi: mt76: mt7925/mt792x: " Zac Bowling
                  ` (11 preceding siblings ...)
  2026-01-05 0:26 ` [PATCH 12/17] wifi: mt76: mt7925: fix key removal failure during MLO roaming Zac Bowling
@ 2026-01-05 0:26 ` Zac Bowling
  2026-01-05 0:26 ` [PATCH 14/17] wifi: mt76: mt7925: add NULL checks for MLO link pointers in MCU functions Zac Bowling
                  ` (4 subsequent siblings)
  17 siblings, 0 replies; 113+ messages in thread
From: Zac Bowling @ 2026-01-05 0:26 UTC (permalink / raw)
  To: zbowling
  Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless,
      lorenzo, nbd, ryder.lee, sean.wang

mt7925_mcu_set_mlo_roc() uses WARN_ON_ONCE() to check if link_conf or
channel is NULL. However, during MLO AP setup, it's normal for the
channel to not be configured yet when this function is called. The
WARN_ON_ONCE triggers a kernel warning/oops that makes the system
appear to have crashed, even though it's just a timing issue.

Example kernel oops during AP setup:
  WARNING: CPU: 0 PID: 12345 at drivers/net/wireless/mediatek/mt76/mt7925/mcu.c:1345
  Call Trace:
   mt7925_mcu_set_mlo_roc+0x...
   mt7925_remain_on_channel+0x...

Replace WARN_ON_ONCE with regular NULL checks and return -ENOLINK to
indicate the link is not fully configured yet. This allows the upper
layers to retry when the link is ready, without spamming the kernel log
with warnings.

Also add a check for mconf in the first loop to match the pattern used
in the second loop, preventing potential NULL dereference.

This fixes kernel oops reported during MLO AP setup on OpenWrt with
MT7925E hardware and similar issues on standard Linux distributions.
Fixes: c5d11e4a9fa8 ("wifi: mt76: mt7925: add mt7925_change_vif_links") Link: https://github.com/openwrt/mt76/issues/1014 Signed-off-by: Zac Bowling <zac@zacbowling.com> --- .../net/wireless/mediatek/mt76/mt7925/mcu.c | 20 +++++++++++++------ 1 file changed, 14 insertions(+), 6 deletions(-) diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/mcu.c b/drivers/net/wireless/mediatek/mt76/mt7925/mcu.c index 958ff9da9f01..8080fea30d23 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7925/mcu.c +++ b/drivers/net/wireless/mediatek/mt76/mt7925/mcu.c @@ -1337,15 +1337,23 @@ int mt7925_mcu_set_mlo_roc(struct mt792x_bss_conf *mconf, u16 sel_links, for (i = 0; i < ARRAY_SIZE(links); i++) { links[i].id = i ? __ffs(~BIT(mconf->link_id) & sel_links) : mconf->link_id; + link_conf = mt792x_vif_to_bss_conf(vif, links[i].id); - if (WARN_ON_ONCE(!link_conf)) - return -EPERM; + if (!link_conf) + return -ENOLINK; links[i].chan = link_conf->chanreq.oper.chan; - if (WARN_ON_ONCE(!links[i].chan)) - return -EPERM; + if (!links[i].chan) + /* Channel not configured yet - this can happen during + * MLO AP setup when links are being added sequentially. + * Return -ENOLINK to indicate link not ready. + */ + return -ENOLINK; links[i].mconf = mt792x_vif_to_link(mvif, links[i].id); + if (!links[i].mconf) + return -ENOLINK; + links[i].tag = links[i].id == mconf->link_id ? UNI_ROC_ACQUIRE : UNI_ROC_SUB_LINK; @@ -1359,8 +1367,8 @@ int mt7925_mcu_set_mlo_roc(struct mt792x_bss_conf *mconf, u16 sel_links, type = MT7925_ROC_REQ_JOIN; for (i = 0; i < ARRAY_SIZE(links) && i < hweight16(vif->active_links); i++) { - if (WARN_ON_ONCE(!links[i].mconf || !links[i].chan)) - continue; + if (!links[i].mconf || !links[i].chan) + return -ENOLINK; chan = links[i].chan; center_ch = ieee80211_frequency_to_channel(chan->center_freq); -- 2.51.0 ^ permalink raw reply related [flat|nested] 113+ messages in thread
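Note (illustrative sketch, not part of the patch): the idea of returning -ENOLINK for a link that is not ready, rather than warning and carrying on, can be shown with a small userspace loop. `struct chan`, `struct link` and `collect_channels()` are hypothetical; the sketch only shows that the whole request is rejected as soon as one link lacks a channel.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical channel and link objects. */
struct chan {
	int center_freq;
};

struct link {
	struct chan *chan;	/* NULL until the channel is configured */
};

/* Gather the centre frequency of every link, or fail with -ENOLINK if
 * any link (or its channel) is not available yet. */
static int collect_channels(struct link *const links[], size_t n, int freqs[])
{
	for (size_t i = 0; i < n; i++) {
		if (!links[i] || !links[i]->chan)
			return -ENOLINK;

		freqs[i] = links[i]->chan->center_freq;
	}

	return 0;
}
```

A caller can tell -ENOLINK apart from other failures and try again once the remaining link has been configured; whether the real upper layers do so is outside what this sketch shows.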
* [PATCH 14/17] wifi: mt76: mt7925: add NULL checks for MLO link pointers in MCU functions
  2026-01-05 0:26 ` [PATCH v3 00/17] wifi: mt76: mt7925/mt792x: " Zac Bowling
                  ` (12 preceding siblings ...)
  2026-01-05 0:26 ` [PATCH 13/17] wifi: mt76: mt7925: fix kernel warning in MLO ROC setup Zac Bowling
@ 2026-01-05 0:26 ` Zac Bowling
  2026-01-05 0:26 ` [PATCH 15/17] wifi: mt76: mt792x: fix firmware reload failure after previous load crash Zac Bowling
                  ` (3 subsequent siblings)
  17 siblings, 0 replies; 113+ messages in thread
From: Zac Bowling @ 2026-01-05 0:26 UTC (permalink / raw)
  To: zbowling
  Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless,
      lorenzo, nbd, ryder.lee, sean.wang

Several MCU functions dereference pointers returned by
mt792x_sta_to_link() and mt792x_vif_to_link() without checking for
NULL. During MLO state transitions, these functions can return NULL
when link state is being set up or torn down, causing kernel NULL
pointer dereferences.

Add NULL checks in the following functions:
- mt7925_mcu_sta_hdr_trans_tlv(): Check mlink before dereferencing wcid
- mt7925_mcu_wtbl_update_hdr_trans(): Check mlink and mconf before use
- mt7925_mcu_sta_amsdu_tlv(): Check mlink before setting amsdu flag
- mt7925_mcu_sta_mld_tlv(): Check mconf and mlink in link iteration loop
- mt7925_mcu_sta_update(): Initialize mlink to NULL and check both
  link_sta and mlink in the ternary condition

These race conditions can occur during:
- MLO link setup/teardown
- Station add/remove operations
- Firmware command generation during state transitions

Found through static analysis (clang-tidy) and pattern matching similar
to fixes in mt7996 and ath12k drivers for MLO link state handling.
Fixes: c948b5da6bbe ("wifi: mt76: mt7925: add Mediatek Wi-Fi7 driver for mt7925 chips") Signed-off-by: Zac Bowling <zac@zacbowling.com> --- drivers/net/wireless/mediatek/mt76/mt7925/mcu.c | 14 ++++++++++++-- 1 file changed, 12 insertions(+), 2 deletions(-) diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/mcu.c b/drivers/net/wireless/mediatek/mt76/mt7925/mcu.c index 8080fea30d23..6f7fc1b9a440 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7925/mcu.c +++ b/drivers/net/wireless/mediatek/mt76/mt7925/mcu.c @@ -1087,6 +1087,8 @@ mt7925_mcu_sta_hdr_trans_tlv(struct sk_buff *skb, struct mt792x_link_sta *mlink; mlink = mt792x_sta_to_link(msta, link_sta->link_id); + if (!mlink) + return; wcid = &mlink->wcid; } else { wcid = &mvif->sta.deflink.wcid; @@ -1120,6 +1122,9 @@ int mt7925_mcu_wtbl_update_hdr_trans(struct mt792x_dev *dev, link_sta = mt792x_sta_to_link_sta(vif, sta, link_id); mconf = mt792x_vif_to_link(mvif, link_id); + if (!mlink || !mconf) + return -EINVAL; + skb = __mt76_connac_mcu_alloc_sta_req(&dev->mt76, &mconf->mt76, &mlink->wcid, MT7925_STA_UPDATE_MAX_SIZE); @@ -1751,6 +1756,8 @@ mt7925_mcu_sta_amsdu_tlv(struct sk_buff *skb, amsdu->amsdu_en = true; mlink = mt792x_sta_to_link(msta, link_sta->link_id); + if (!mlink) + return; mlink->wcid.amsdu = true; switch (link_sta->agg.max_amsdu_len) { @@ -1953,6 +1960,9 @@ mt7925_mcu_sta_mld_tlv(struct sk_buff *skb, mconf = mt792x_vif_to_link(mvif, i); mlink = mt792x_sta_to_link(msta, i); + if (!mconf || !mlink) + continue; + mld->link[cnt].wlan_id = cpu_to_le16(mlink->wcid.idx); mld->link[cnt++].bss_idx = mconf->mt76.idx; @@ -2045,7 +2055,7 @@ int mt7925_mcu_sta_update(struct mt792x_dev *dev, .rcpi = to_rcpi(rssi), }; struct mt792x_sta *msta; - struct mt792x_link_sta *mlink; + struct mt792x_link_sta *mlink = NULL; lockdep_assert_held(&dev->mt76.mutex); @@ -2053,7 +2063,7 @@ int mt7925_mcu_sta_update(struct mt792x_dev *dev, msta = (struct mt792x_sta *)link_sta->sta->drv_priv; mlink = mt792x_sta_to_link(msta, 
link_sta->link_id); } - info.wcid = link_sta ? &mlink->wcid : &mvif->sta.deflink.wcid; + info.wcid = (link_sta && mlink) ? &mlink->wcid : &mvif->sta.deflink.wcid; info.newly = state != MT76_STA_INFO_STATE_ASSOC; return mt7925_mcu_sta_cmd(&dev->mphy, &info); -- 2.51.0 ^ permalink raw reply related [flat|nested] 113+ messages in thread
* [PATCH 15/17] wifi: mt76: mt792x: fix firmware reload failure after previous load crash
  2026-01-05 0:26 ` [PATCH v3 00/17] wifi: mt76: mt7925/mt792x: " Zac Bowling
                  ` (13 preceding siblings ...)
  2026-01-05 0:26 ` [PATCH 14/17] wifi: mt76: mt7925: add NULL checks for MLO link pointers in MCU functions Zac Bowling
@ 2026-01-05 0:26 ` Zac Bowling
  2026-01-05 0:26 ` [PATCH 16/17] wifi: mt76: mt7925: add mutex protection in resume path Zac Bowling
                  ` (2 subsequent siblings)
  17 siblings, 0 replies; 113+ messages in thread
From: Zac Bowling @ 2026-01-05 0:26 UTC (permalink / raw)
  To: zbowling
  Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless,
      lorenzo, nbd, ryder.lee, sean.wang

If the firmware loading process crashes or is interrupted after
acquiring the patch semaphore but before releasing it, subsequent
firmware load attempts will fail with 'Failed to get patch semaphore'
because the semaphore is still held.

This issue manifests as devices becoming unusable after suspend/resume
failures or firmware crashes, requiring a full hardware reboot to
recover. This has been widely reported on MT7921 and MT7925 devices.

Example error log:
  mt7921e 0000:c2:00.0: Failed to get patch semaphore
  mt7921e 0000:c2:00.0: probe with driver mt7921e failed with error -5

Apply the same fix that was applied to MT7915 in commit 79dd14f:
1. Release the patch semaphore before starting firmware load (in case
   it was held by a previous failed attempt)
2. Restart MCU firmware to ensure clean state
3. Wait briefly for MCU to be ready

This fix applies to both MT7921 and MT7925 drivers which share the
mt792x_load_firmware() function.
Fixes: 583204ae70f9 ("wifi: mt76: mt792x: move mt7921_load_firmware in mt792x-lib module") Link: https://github.com/openwrt/mt76/commit/79dd14f2e8161b656341b6653261779199aedbe4 Signed-off-by: Zac Bowling <zac@zacbowling.com> --- drivers/net/wireless/mediatek/mt76/mt792x_core.c | 14 ++++++++++++++ 1 file changed, 14 insertions(+) diff --git a/drivers/net/wireless/mediatek/mt76/mt792x_core.c b/drivers/net/wireless/mediatek/mt76/mt792x_core.c index 9dc768aa8b9c..05598202b488 100644 --- a/drivers/net/wireless/mediatek/mt76/mt792x_core.c +++ b/drivers/net/wireless/mediatek/mt76/mt792x_core.c @@ -936,6 +936,20 @@ int mt792x_load_firmware(struct mt792x_dev *dev) { int ret; + /* Release semaphore if taken by previous failed load attempt. + * This prevents "Failed to get patch semaphore" errors when + * recovering from firmware crashes or suspend/resume failures. + */ + ret = mt76_connac_mcu_patch_sem_ctrl(&dev->mt76, false); + if (ret < 0) + dev_dbg(dev->mt76.dev, "Semaphore release returned %d (may be expected)\n", ret); + + /* Always restart MCU to ensure clean state before loading firmware */ + mt76_connac_mcu_restart(&dev->mt76); + + /* Wait for MCU to be ready after restart */ + msleep(100); + ret = mt76_connac2_load_patch(&dev->mt76, mt792x_patch_name(dev)); if (ret) return ret; -- 2.51.0 ^ permalink raw reply related [flat|nested] 113+ messages in thread
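The failure mode and the release-then-load ordering from the commit message can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `struct fake_mcu` and every function below are hypothetical names invented here, the semaphore is modelled as a plain flag, and the error codes are chosen for the model rather than taken from the firmware interface.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of the MCU-side patch semaphore; not driver code. */
struct fake_mcu {
	bool sem_held;  /* true if an interrupted load left the semaphore held */
	int restarts;   /* number of simulated MCU restarts */
};

static int sem_get(struct fake_mcu *mcu)
{
	if (mcu->sem_held)
		return -EIO;  /* models "Failed to get patch semaphore" */
	mcu->sem_held = true;
	return 0;
}

static int sem_release(struct fake_mcu *mcu)
{
	if (!mcu->sem_held)
		return -EINVAL;  /* nothing to release; harmless in this model */
	mcu->sem_held = false;
	return 0;
}

/* Patch load: take the semaphore, download (elided), release it. */
static int load_patch(struct fake_mcu *mcu)
{
	int ret = sem_get(mcu);

	if (ret)
		return ret;
	/* the patch download itself would happen here */
	return sem_release(mcu);
}

/* Behaviour before the patch: a stale semaphore makes every load fail. */
static int load_firmware_unpatched(struct fake_mcu *mcu)
{
	return load_patch(mcu);
}

/* Behaviour after the patch: release first, restart, then load. */
static int load_firmware_patched(struct fake_mcu *mcu)
{
	(void)sem_release(mcu);  /* the result is deliberately ignored */
	mcu->restarts++;         /* stands in for the MCU restart and wait */
	return load_patch(mcu);
}
```

In this model a device whose previous load was interrupted keeps failing with `-EIO` in the unpatched path, while the patched path recovers on the first attempt. The cost is one extra restart on every load, including loads on a clean device.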
* [PATCH 16/17] wifi: mt76: mt7925: add mutex protection in resume path
  2026-01-05 0:26 ` [PATCH v3 00/17] wifi: mt76: mt7925/mt792x: " Zac Bowling
  ` (14 preceding siblings ...)
  2026-01-05 0:26 ` [PATCH 15/17] wifi: mt76: mt792x: fix firmware reload failure after previous load crash Zac Bowling
@ 2026-01-05 0:26 ` Zac Bowling
  2026-01-05 0:26 ` [PATCH 17/17] wifi: mt76: mt7925: add NULL checks in link station and TX queue setup Zac Bowling
  2026-01-16 0:15 ` [PATCH v3 00/17] wifi: mt76: mt7925/mt792x: comprehensive stability fixes Sean Wang
  17 siblings, 0 replies; 113+ messages in thread
From: Zac Bowling @ 2026-01-05 0:26 UTC (permalink / raw)
  To: zbowling
  Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless, lorenzo, nbd, ryder.lee, sean.wang

Add mutex protection around mt7925_mcu_set_deep_sleep() and
mt7925_mcu_regd_update() calls in the resume path to prevent
potential race conditions during resume operations.

These MCU operations require serialization, and the resume path
was the only call site missing mutex protection. Without this,
concurrent access during resume could corrupt firmware state or
cause race conditions with other MCU commands.

Found by static analysis (sparse/coccinelle) pattern matching for
unprotected MCU function calls.

Fixes: c948b5da6bbe ("wifi: mt76: mt7925: add Mediatek Wi-Fi7 driver for mt7925 chips")
Signed-off-by: Zac Bowling <zac@zacbowling.com>
---
 drivers/net/wireless/mediatek/mt76/mt7925/pci.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/pci.c b/drivers/net/wireless/mediatek/mt76/mt7925/pci.c
index e9d62c6aee91..3a9e32a1759d 100644
--- a/drivers/net/wireless/mediatek/mt76/mt7925/pci.c
+++ b/drivers/net/wireless/mediatek/mt76/mt7925/pci.c
@@ -584,10 +584,12 @@ static int _mt7925_pci_resume(struct device *device, bool restore)
 	}
 
 	/* restore previous ds setting */
+	mt792x_mutex_acquire(dev);
 	if (!pm->ds_enable)
 		mt7925_mcu_set_deep_sleep(dev, false);
 
 	mt7925_mcu_regd_update(dev, mdev->alpha2, dev->country_ie_env);
+	mt792x_mutex_release(dev);
 failed:
 	pm->suspended = false;
 
-- 
2.51.0

^ permalink raw reply related [flat|nested] 113+ messages in thread
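The serialization requirement described in this patch can be made visible with a toy model that counts MCU commands issued without the lock held, in the spirit of a `lockdep_assert_held()` check. Everything below is a hypothetical sketch: `struct fake_dev`, `dev_lock()`, `dev_unlock()` and `mcu_cmd()` are invented names, and the "mutex" is a plain flag, so the model only tracks whether the lock is held and does not provide real mutual exclusion.

```c
#include <stdbool.h>

/* Hypothetical single-threaded model of a device mutex; not driver code. */
struct fake_dev {
	bool mutex_held;    /* stands in for the device mutex */
	int cmds;           /* MCU commands issued in total */
	int unlocked_cmds;  /* MCU commands issued without the lock */
};

static void dev_lock(struct fake_dev *dev)
{
	dev->mutex_held = true;
}

static void dev_unlock(struct fake_dev *dev)
{
	dev->mutex_held = false;
}

/* Every MCU command records whether the caller held the lock. */
static void mcu_cmd(struct fake_dev *dev)
{
	dev->cmds++;
	if (!dev->mutex_held)
		dev->unlocked_cmds++;
}

/* Resume tail before the patch: both commands run unprotected. */
static void resume_unpatched(struct fake_dev *dev, bool ds_enable)
{
	if (!ds_enable)
		mcu_cmd(dev);  /* models the deep-sleep command */
	mcu_cmd(dev);          /* models the regulatory-domain update */
}

/* Resume tail after the patch: the same commands inside the lock. */
static void resume_patched(struct fake_dev *dev, bool ds_enable)
{
	dev_lock(dev);
	if (!ds_enable)
		mcu_cmd(dev);
	mcu_cmd(dev);
	dev_unlock(dev);
}
```

With `ds_enable` false, the unpatched path records two unprotected commands and the patched path records none, which is the property the two added lines in the diff are meant to establish.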
* [PATCH 17/17] wifi: mt76: mt7925: add NULL checks in link station and TX queue setup
  2026-01-05 0:26 ` [PATCH v3 00/17] wifi: mt76: mt7925/mt792x: " Zac Bowling
  ` (15 preceding siblings ...)
  2026-01-05 0:26 ` [PATCH 16/17] wifi: mt76: mt7925: add mutex protection in resume path Zac Bowling
@ 2026-01-05 0:26 ` Zac Bowling
  2026-01-11 3:13 ` Zac Bowling
  2026-01-16 0:15 ` [PATCH v3 00/17] wifi: mt76: mt7925/mt792x: comprehensive stability fixes Sean Wang
  17 siblings, 1 reply; 113+ messages in thread
From: Zac Bowling @ 2026-01-05 0:26 UTC (permalink / raw)
  To: zbowling
  Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless, lorenzo, nbd, ryder.lee, sean.wang

Add NULL pointer checks for mt792x_sta_to_link() and mt792x_vif_to_link()
results in critical paths to prevent kernel crashes during MLO operations.

Functions fixed:
- mt7925_mac_link_sta_add(): Check mlink and mconf before dereferencing
- mt7925_conf_tx(): Check mconf before accessing queue_params

These can be NULL during MLO link setup/teardown when mac80211 state
may not be fully synchronized with driver state.

Found through static analysis and pattern matching.

Fixes: c948b5da6bbe ("wifi: mt76: mt7925: add Mediatek Wi-Fi7 driver for mt7925 chips")
Signed-off-by: Zac Bowling <zac@zacbowling.com>
---
 drivers/net/wireless/mediatek/mt76/mt7925/main.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/main.c b/drivers/net/wireless/mediatek/mt76/mt7925/main.c
index 7cf6faa1f6f4..81373e479abd 100644
--- a/drivers/net/wireless/mediatek/mt76/mt7925/main.c
+++ b/drivers/net/wireless/mediatek/mt76/mt7925/main.c
@@ -871,12 +871,17 @@ static int mt7925_mac_link_sta_add(struct mt76_dev *mdev,
 
 	msta = (struct mt792x_sta *)link_sta->sta->drv_priv;
 	mlink = mt792x_sta_to_link(msta, link_id);
+	if (!mlink)
+		return -EINVAL;
 
 	idx = mt76_wcid_alloc(dev->mt76.wcid_mask, MT792x_WTBL_STA - 1);
 	if (idx < 0)
 		return -ENOSPC;
 
 	mconf = mt792x_vif_to_link(mvif, link_id);
+	if (!mconf)
+		return -EINVAL;
+
 	mt76_wcid_init(&mlink->wcid, 0);
 	mlink->wcid.sta = 1;
 	mlink->wcid.idx = idx;
@@ -1735,6 +1740,9 @@ mt7925_conf_tx(struct ieee80211_hw *hw, struct ieee80211_vif *vif,
 		[IEEE80211_AC_BK] = 1,
 	};
 
+	if (!mconf)
+		return -EINVAL;
+
 	/* firmware uses access class index */
 	mconf->queue_params[mq_to_aci[queue]] = *params;
 
-- 
2.51.0

^ permalink raw reply related [flat|nested] 113+ messages in thread
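The checks added to the link-station setup can be sketched in userspace as follows. Note the deliberate difference from the patch: this model validates both lookups before allocating an index, so that an early return cannot leave an allocated slot behind in the model. That reordering is an illustration choice made here, not something the patch does or the thread discusses. All names (`struct fake_link`, `slot_alloc()`, `link_sta_add()`, `N_SLOTS`) are hypothetical, and the allocator is assumed to mark a slot as used when it succeeds.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define N_SLOTS 32

/* Hypothetical stand-ins for the per-link station state; not mt76 types. */
struct fake_wcid {
	int idx;
	int sta;
};

struct fake_link {
	struct fake_wcid wcid;
};

/*
 * Toy bitmap allocator: returns the lowest free index and marks it used,
 * or -1 when every slot is taken.
 */
static int slot_alloc(uint32_t *mask)
{
	for (int i = 0; i < N_SLOTS; i++) {
		if (!(*mask & (UINT32_C(1) << i))) {
			*mask |= UINT32_C(1) << i;
			return i;
		}
	}
	return -1;
}

/*
 * Check both lookup results first, then allocate. Rejecting NULL before
 * the allocation means no slot is consumed on the error paths.
 */
static int link_sta_add(uint32_t *mask, struct fake_link *mlink,
			const void *mconf)
{
	int idx;

	if (!mlink || !mconf)
		return -EINVAL;

	idx = slot_alloc(mask);
	if (idx < 0)
		return -ENOSPC;

	mlink->wcid.sta = 1;
	mlink->wcid.idx = idx;
	return 0;
}
```

A caller that sees `-EINVAL` from this model can retry later without any cleanup, because the slot bitmap is unchanged on that path.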
* Re: [PATCH 17/17] wifi: mt76: mt7925: add NULL checks in link station and TX queue setup
  2026-01-05 0:26 ` [PATCH 17/17] wifi: mt76: mt7925: add NULL checks in link station and TX queue setup Zac Bowling
@ 2026-01-11 3:13 ` Zac Bowling
  2026-01-11 3:36 ` Zac Bowling
  0 siblings, 1 reply; 113+ messages in thread
From: Zac Bowling @ 2026-01-11 3:13 UTC (permalink / raw)
  To: zbowling
  Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless, lorenzo, nbd, ryder.lee, sean.wang

Hi folks,

Any chance you have had time to review? I got a new laptop, courtesy of
the folks at Framework, to keep debugging this. It has the same Wi-Fi
chip, and it didn't take long to reproduce the issue on an unpatched
kernel, which these patches would fix.

Not even 5 minutes after booting this new laptop with a fresh Arch
install, it crashed.

Patch 1 is the direct fix. Patch 2 prevents the underlying race
condition in this specific crash.

Below is the dump from this machine. My second patch in this series
would stop this particular race and NULL dereference during MLO roaming.

Jan 10 18:38:57 zcache kernel: wlan0: RX ReassocResp from d8:b3:70:f8:9e:7b (capab=0x1111 status=30 aid=0) Jan 10 18:38:57 zcache kernel: wlan0: d8:b3:70:f8:9e:7b rejected association temporarily; comeback duration 895 TU (916 ms) Jan 10 18:38:57 zcache kernel: wlan0: RX ReassocResp from d8:b3:70:f8:9e:7b (capab=0x1111 status=30 aid=0) Jan 10 18:38:57 zcache kernel: wlan0: d8:b3:70:f8:9e:7b rejected association temporarily; comeback duration 794 TU (813 ms) Jan 10 18:38:57 zcache kernel: wlan0: association with d8:b3:70:f8:9e:7b timed out Jan 10 18:39:01 zcache kernel: mt7925e 0000:bf:00.0: Message 00020002 (seq 11) timeout Jan 10 18:39:07 zcache kernel: mt7925e 0000:bf:00.0: Message 00020003 (seq 12) timeout Jan 10 18:39:10 zcache kernel: mt7925e 0000:bf:00.0: Message 00020002 (seq 13) timeout Jan 10 18:39:13 zcache kernel: mt7925e 0000:bf:00.0: Message 00020002 (seq 14) timeout Jan 10 18:39:16 zcache kernel: mt7925e 0000:bf:00.0: Message 00020001 (seq 15) timeout Jan 10 18:39:16 zcache kernel: mt7925e 0000:bf:00.0: HW/SW Version: 0x8a108a10, Build Time: 20251015212927a Jan 10 18:39:16 zcache kernel: mt7925e 0000:bf:00.0: WM Firmware Version: ____000000, Build Time: 20251015213023 Jan 10 18:39:17 zcache kernel: BUG: kernel NULL pointer dereference, address: 0000000000000000 Jan 10 18:39:17 zcache kernel: #PF: supervisor read access in kernel mode Jan 10 18:39:17 zcache kernel: #PF: error_code(0x0000) - not-present page Jan 10 18:39:17 zcache kernel: PGD 0 P4D 0 Jan 10 18:39:17 zcache kernel: Oops: Oops: 0000 [#1] SMP NOPTI Jan 10 18:39:17 zcache kernel: CPU: 0 UID: 0 PID: 42446 Comm: kworker/u96:0 Tainted: G O 6.18.4-2-cachyos #1 PREEMPT(full) b0274bd1b2c7bedbf3a9a6159178cc392f0fdb5c Jan 10 18:39:17 zcache kernel: Tainted: [O]=OOT_MODULE Jan 10 18:39:17 zcache kernel: Hardware name: Framework Laptop 16 (AMD Ryzen AI 300 Series)/FRANMHCP09, BIOS 03.04 11/06/2025 Jan 10 18:39:17 zcache kernel: Workqueue: mt76 mt7925_mac_reset_work [mt7925_common] Jan 10 18:39:17 
zcache kernel: RIP: 0010:mt76_connac_mcu_uni_add_dev+0xf1/0x210 [mt76_connac_lib] Jan 10 18:39:17 zcache kernel: Code: 0f b7 89 b8 00 00 00 66 89 4c 24 1c c7 44 24 1e 00 00 00 00 66 89 4c 24 22 66 c7 44 24 24 00 00 c6 44 24 26 00 44 88 4c 24 27 <48> 8b 0e 8b 09 83 f9 05 7f 11 83 f9 01 74 2f 83 f9 02 74 3c 83 f9 Jan 10 18:39:17 zcache kernel: RSP: 0018:ffffccb5e1d17cb0 EFLAGS: 00010282 Jan 10 18:39:17 zcache kernel: RAX: ffffccb5e1d17ce2 RBX: ffff8a0f37682060 RCX: 0000000000000013 Jan 10 18:39:17 zcache kernel: RDX: ffff8a0f0cd12588 RSI: 0000000000000000 RDI: 0000000000000000 Jan 10 18:39:17 zcache kernel: RBP: 0000000000000001 R08: 0000000000000000 R09: 0000000000000000 Jan 10 18:39:17 zcache kernel: R10: 0000000000000000 R11: ffffffff999362b0 R12: 0000000000000001 Jan 10 18:39:17 zcache kernel: R13: 0000000000000000 R14: ffff8a0f37682060 R15: ffff8a0f0cd12588 Jan 10 18:39:17 zcache kernel: FS: 0000000000000000(0000) GS:ffff8a12b18fe000(0000) knlGS:0000000000000000 Jan 10 18:39:17 zcache kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 Jan 10 18:39:17 zcache kernel: CR2: 0000000000000000 CR3: 00000003f0a12000 CR4: 0000000000f50ef0 Jan 10 18:39:17 zcache kernel: PKRU: 55555554 Jan 10 18:39:17 zcache kernel: Call Trace: Jan 10 18:39:17 zcache kernel: <TASK> Jan 10 18:39:17 zcache kernel: mt7925_vif_connect_iter+0x95/0x190 [mt7925_common dc0066c6b1da3a3d20cb990f664250b31cf0a3c5] Jan 10 18:39:17 zcache kernel: __iterate_interfaces+0x55/0x130 [mac80211 9bef1c01c9f6e23856cab5358ede5658fefb3669] Jan 10 18:39:17 zcache kernel: ? __pfx_mt7925_vif_connect_iter+0x10/0x10 [mt7925_common dc0066c6b1da3a3d20cb990f664250b31cf0a3c5] Jan 10 18:39:17 zcache kernel: ? 
__pfx_mt7925_vif_connect_iter+0x10/0x10 [mt7925_common dc0066c6b1da3a3d20cb990f664250b31cf0a3c5] Jan 10 18:39:17 zcache kernel: ieee80211_iterate_interfaces+0x3b/0x50 [mac80211 9bef1c01c9f6e23856cab5358ede5658fefb3669] Jan 10 18:39:17 zcache kernel: mt7925_mac_reset_work+0x2a3/0x360 [mt7925_common dc0066c6b1da3a3d20cb990f664250b31cf0a3c5] Jan 10 18:39:17 zcache kernel: ? enable_work+0x2b/0xd0 Jan 10 18:39:17 zcache kernel: process_scheduled_works+0x24b/0x5a0 Jan 10 18:39:17 zcache kernel: worker_thread+0x188/0x360 Jan 10 18:39:17 zcache kernel: ? __pfx_worker_thread+0x10/0x10 Jan 10 18:39:17 zcache kernel: kthread+0x217/0x250 Jan 10 18:39:17 zcache kernel: ? __pfx_kthread+0x10/0x10 Jan 10 18:39:17 zcache kernel: ret_from_fork+0xf1/0x1f0 Jan 10 18:39:17 zcache kernel: ? __pfx_kthread+0x10/0x10 Jan 10 18:39:17 zcache kernel: ret_from_fork_asm+0x1a/0x30 Jan 10 18:39:17 zcache kernel: </TASK> Jan 10 18:39:17 zcache kernel: Modules linked in: overlay tcp_diag inet_diag ccm snd_seq_dummy snd_hrtimer rfcomm snd_seq snd_seq_device tun cmac algif_hash algif_skcipher af_alg bnep snd_ctl_led vfat fat snd_acp_legacy_mach snd_acp_mach snd_soc_nau8821 snd_acp3x_rn snd_acp70 snd_acp_pdm snd_acp_i2s snd_soc_dmic snd_acp_pcm snd_sof_amd_acp70 snd_sof_amd_acp63 snd_sof_amd_vangogh snd_sof_amd_rembrandt snd_sof_amd_renoir snd_sof_amd_acp snd_sof_pci snd_sof_xtensa_dsp snd_sof snd_sof_utils snd_pci_ps snd_soc_acpi_amd_match soundwire_amd soundwire_generic_allocation snd_amd_sdw_acpi snd_hda_codec_alc269 soundwire_bus intel_rapl_msr amd_atl snd_hda_scodec_component snd_soc_sdca intel_rapl_common snd_hda_codec_realtek_lib mt7925e snd_soc_core snd_hda_codec_generic mt7925_common snd_hda_codec_atihdmi ac97_bus snd_hda_codec_nvhdmi mt792x_lib snd_pcm_dmaengine snd_hda_codec_hdmi snd_compress mt76_connac_lib snd_rpl_pci_acp6x mt76 hid_sensor_als snd_acp_pci hid_sensor_trigger snd_hda_intel snd_amd_acpi_mach industrialio_triggered_buffer uvcvideo Jan 10 18:39:17 zcache kernel: uvc mac80211 
snd_acp_legacy_common kfifo_buf videobuf2_vmalloc mousedev snd_hda_codec kvm_amd hid_sensor_iio_common snd_pci_acp6x leds_cros_ec videobuf2_memops btusb btmtk ucsi_acpi snd_pci_acp5x snd_hda_core videobuf2_v4l2 industrialio cros_ec_sysfs cros_charge_control cros_ec_debugfs gpio_cros_ec led_class_multicolor videobuf2_common cros_ec_chardev cros_ec_hwmon typec_ucsi snd_rn_pci_acp3x amd_pmf btbcm snd_intel_dspcfg kvm cfg80211 snd_intel_sdw_acpi ip6t_REJECT typec amdtee snd_acp_config btintel snd_hwdep spd5118 videodev nf_reject_ipv6 hid_sensor_hub amd_sfh cros_ec_dev hid_multitouch irqbypass btrtl snd_soc_acpi roles snd_pcm polyval_clmulni platform_profile cros_ec_lpcs mc ccp snd_pci_acp3x snd_timer bluetooth ghash_clmulni_intel joydev xt_hl aesni_intel snd cros_ec i2c_piix4 thunderbolt rfkill i2c_hid_acpi ip6t_rt tee 8250_dw amd_pmc soundcore cros_ec_proto amdxdna wmi_bmof pcspkr k10temp nvidia_wmi_ec_backlight rapl i2c_smbus libarc4 i2c_hid mac_hid ipt_REJECT nf_reject_ipv4 xt_LOG Jan 10 18:39:17 zcache kernel: nf_log_syslog nft_limit xt_limit xt_addrtype xt_tcpudp xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nft_compat x_tables nf_tables dm_mod pkcs8_key_parser crypto_user ntsync i2c_dev nfnetlink zram 842_compress 842_decompress lz4hc_compress lz4_compress nvidia_uvm(O) amdgpu nvidia_drm(O) nvidia_modeset(O) drm_panel_backlight_quirks drm_buddy drm_suballoc_helper drm_exec nvme i2c_algo_bit gpu_sched nvme_core amdxcp nvme_keyring drm_display_helper nvme_auth cec hkdf nvidia(O) drm_ttm_helper video ttm wmi Jan 10 18:39:17 zcache kernel: CR2: 0000000000000000 Jan 10 18:39:17 zcache kernel: ---[ end trace 0000000000000000 ]--- Jan 10 18:39:17 zcache kernel: RIP: 0010:mt76_connac_mcu_uni_add_dev+0xf1/0x210 [mt76_connac_lib] Jan 10 18:39:17 zcache kernel: Code: 0f b7 89 b8 00 00 00 66 89 4c 24 1c c7 44 24 1e 00 00 00 00 66 89 4c 24 22 66 c7 44 24 24 00 00 c6 44 24 26 00 44 88 4c 24 27 <48> 8b 0e 8b 09 83 f9 05 7f 11 83 f9 01 74 2f 83 f9 02 74 3c 83 f9 Jan 10 
18:39:17 zcache kernel: RSP: 0018:ffffccb5e1d17cb0 EFLAGS: 00010282 Jan 10 18:39:17 zcache kernel: RAX: ffffccb5e1d17ce2 RBX: ffff8a0f37682060 RCX: 0000000000000013 Jan 10 18:39:17 zcache kernel: RDX: ffff8a0f0cd12588 RSI: 0000000000000000 RDI: 0000000000000000 Jan 10 18:39:17 zcache kernel: RBP: 0000000000000001 R08: 0000000000000000 R09: 0000000000000000 Jan 10 18:39:17 zcache kernel: R10: 0000000000000000 R11: ffffffff999362b0 R12: 0000000000000001 Jan 10 18:39:17 zcache kernel: R13: 0000000000000000 R14: ffff8a0f37682060 R15: ffff8a0f0cd12588 Jan 10 18:39:17 zcache kernel: FS: 0000000000000000(0000) GS:ffff8a12b18fe000(0000) knlGS:0000000000000000 Jan 10 18:39:17 zcache kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 Zac Bowling ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: [PATCH 17/17] wifi: mt76: mt7925: add NULL checks in link station and TX queue setup
  2026-01-11 3:13 ` Zac Bowling
@ 2026-01-11 3:36 ` Zac Bowling
  0 siblings, 0 replies; 113+ messages in thread
From: Zac Bowling @ 2026-01-11 3:36 UTC (permalink / raw)
  To: zbowling
  Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless, lorenzo, nbd, ryder.lee, sean.wang

Here is another related stack trace from the same crash, below.

Breakdown of what is happening in the journal:

18:33:46 - Connected successfully to the AP on 6 GHz (WiFi 6E/7 MLO), got DHCP lease

18:38:54 - Roaming triggered (probably signal-based)
  - wlan0: disconnect from AP d8:b3:70:f8:9e:7b for new auth to d8:b3:70:f8:9e:7b
  - Tried to roam to a different link on the same MLO AP (6535 MHz → 5765 MHz)
  - MLO errors appear: nl80211: kernel reports: link ID must for MLO group key (3x)
  - nl80211: kernel reports: Error fetching BSS for link
  - wlan0: SME: Association request to the driver failed

18:38:55-57 - Falling back, trying 5 GHz link
  - Authenticated OK, but association keeps failing
  - AP saying "comeback later" (status=30, comeback duration)
  - wlan0: association with d8:b3:70:f8:9e:7b timed out

18:39:01 - Driver starts dying
  - MCU message timeouts (firmware not responding)
  - Driver triggers reset work → NULL pointer crash

The root cause is MLO roaming between links on the same AP. The driver
failed to properly handle the association failure, the firmware became
unresponsive, and then the reset path crashed because it passed a NULL
vif/bss pointer to mt76_connac_mcu_uni_add_dev().

This is the same WiFi 7 MLO bug - the driver doesn't properly handle the
case where link association fails during roaming. There may be more
issues in the firmware in this case too, but what we do in the kernel is
dangerous when it does happen.

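The reset-path failure analysed here comes down to a bitmap that marks a link as valid while the corresponding link configuration pointer is still NULL. A minimal userspace sketch of the guarded iteration follows. The types and names (`struct fake_vif`, `struct fake_bss_conf`, `vif_to_bss_conf()`, `connect_links()`, `MAX_LINKS`) are hypothetical stand-ins chosen for illustration and do not mirror the mac80211 or mt76 definitions.

```c
#include <stddef.h>

#define MAX_LINKS 15

/* Hypothetical stand-ins for the per-link configuration; not kernel types. */
struct fake_bss_conf {
	int type;
};

struct fake_vif {
	unsigned long valid_links;                   /* driver-side bitmap */
	struct fake_bss_conf *link_conf[MAX_LINKS];  /* may lag the bitmap */
};

/* Lookup that may return NULL even for a bit set in valid_links. */
static struct fake_bss_conf *vif_to_bss_conf(struct fake_vif *vif,
					     unsigned int link_id)
{
	return link_id < MAX_LINKS ? vif->link_conf[link_id] : NULL;
}

/*
 * Walk every link marked valid and skip those whose configuration is
 * not available yet. Returns the number of links actually handled.
 */
static int connect_links(struct fake_vif *vif)
{
	int configured = 0;

	for (unsigned int i = 0; i < MAX_LINKS; i++) {
		struct fake_bss_conf *conf;

		if (!(vif->valid_links & (1UL << i)))
			continue;

		conf = vif_to_bss_conf(vif, i);
		if (!conf)
			continue;  /* marked valid, but nothing to dereference */

		/* a real driver would issue the per-link command here */
		configured++;
	}

	return configured;
}
```

Skipping the link keeps the reset worker alive in this model, but it does not repair the mismatch between the bitmap and the configuration table, which is why the message above also points at the roaming path that creates the mismatch.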
Jan 10 11:44:03 zcache kernel: mt7925e 0000:bf:00.0: ASIC revision: 79250000 Jan 10 11:44:03 zcache kernel: mt7925e 0000:bf:00.0: HW/SW Version: 0x8a108a10, Build Time: 20251015212927a Jan 10 11:44:03 zcache kernel: mt7925e 0000:bf:00.0: WM Firmware Version: ____000000, Build Time: 20251015213023 Jan 10 18:39:01 zcache kernel: mt7925e 0000:bf:00.0: Message 00020002 (seq 11) timeout Jan 10 18:39:07 zcache kernel: mt7925e 0000:bf:00.0: Message 00020003 (seq 12) timeout Jan 10 18:39:10 zcache kernel: mt7925e 0000:bf:00.0: Message 00020002 (seq 13) timeout Jan 10 18:39:13 zcache kernel: mt7925e 0000:bf:00.0: Message 00020002 (seq 14) timeout Jan 10 18:39:16 zcache kernel: mt7925e 0000:bf:00.0: Message 00020001 (seq 15) timeout Jan 10 18:39:16 zcache kernel: mt7925e 0000:bf:00.0: HW/SW Version: 0x8a108a10, Build Time: 20251015212927a Jan 10 18:39:16 zcache kernel: mt7925e 0000:bf:00.0: WM Firmware Version: ____000000, Build Time: 20251015213023 Jan 10 18:39:17 zcache kernel: BUG: kernel NULL pointer dereference, address: 0000000000000000 Jan 10 18:39:17 zcache kernel: #PF: supervisor read access in kernel mode Jan 10 18:39:17 zcache kernel: #PF: error_code(0x0000) - not-present page Jan 10 18:39:17 zcache kernel: PGD 0 P4D 0 Jan 10 18:39:17 zcache kernel: Oops: Oops: 0000 [#1] SMP NOPTI Jan 10 18:39:17 zcache kernel: CPU: 0 UID: 0 PID: 42446 Comm: kworker/u96:0 Tainted: G O 6.18.4-2-cachyos #1 PREEMPT(full) b0274bd1b2c7bedbf3a9a6159178cc392f0fdb5c Jan 10 18:39:17 zcache kernel: Tainted: [O]=OOT_MODULE Jan 10 18:39:17 zcache kernel: Hardware name: Framework Laptop 16 (AMD Ryzen AI 300 Series)/FRANMHCP09, BIOS 03.04 11/06/2025 Jan 10 18:39:17 zcache kernel: Workqueue: mt76 mt7925_mac_reset_work [mt7925_common] Jan 10 18:39:17 zcache kernel: RIP: 0010:mt76_connac_mcu_uni_add_dev+0xf1/0x210 [mt76_connac_lib] Jan 10 18:39:17 zcache kernel: RSP: 0018:ffffccb5e1d17cb0 EFLAGS: 00010282 Jan 10 18:39:17 zcache kernel: RAX: ffffccb5e1d17ce2 RBX: ffff8a0f37682060 RCX: 
0000000000000013 Jan 10 18:39:17 zcache kernel: RDX: ffff8a0f0cd12588 RSI: 0000000000000000 RDI: 0000000000000000 Jan 10 18:39:17 zcache kernel: RBP: 0000000000000001 R08: 0000000000000000 R09: 0000000000000000 Jan 10 18:39:17 zcache kernel: R10: 0000000000000000 R11: ffffffff999362b0 R12: 0000000000000001 Jan 10 18:39:17 zcache kernel: R13: 0000000000000000 R14: ffff8a0f37682060 R15: ffff8a0f0cd12588 Jan 10 18:39:17 zcache kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 Jan 10 18:39:17 zcache kernel: CR2: 0000000000000000 CR3: 00000003f0a12000 CR4: 0000000000f50ef0 Jan 10 18:39:17 zcache kernel: Call Trace: Jan 10 18:39:17 zcache kernel: <TASK> Jan 10 18:39:17 zcache kernel: mt7925_vif_connect_iter+0x95/0x190 [mt7925_common dc0066c6b1da3a3d20cb990f664250b31cf0a3c5] Jan 10 18:39:17 zcache kernel: ? __pfx_mt7925_vif_connect_iter+0x10/0x10 [mt7925_common dc0066c6b1da3a3d20cb990f664250b31cf0a3c5] Jan 10 18:39:17 zcache kernel: ? __pfx_mt7925_vif_connect_iter+0x10/0x10 [mt7925_common dc0066c6b1da3a3d20cb990f664250b31cf0a3c5] Jan 10 18:39:17 zcache kernel: mt7925_mac_reset_work+0x2a3/0x360 [mt7925_common dc0066c6b1da3a3d20cb990f664250b31cf0a3c5] Jan 10 18:39:17 zcache kernel: </TASK> Jan 10 18:39:17 zcache kernel: Modules linked in: overlay tcp_diag inet_diag ccm snd_seq_dummy snd_hrtimer rfcomm snd_seq snd_seq_device tun cmac algif_hash algif_skcipher af_alg bnep snd_ctl_led vfat fat snd_acp_legacy_mach snd_acp_mach snd_soc_nau8821 snd_acp3x_rn snd_acp70 snd_acp_pdm snd_acp_i2s snd_soc_dmic snd_acp_pcm snd_sof_amd_acp70 snd_sof_amd_acp63 snd_sof_amd_vangogh snd_sof_amd_rembrandt snd_sof_amd_renoir snd_sof_amd_acp snd_sof_pci snd_sof_xtensa_dsp snd_sof snd_sof_utils snd_pci_ps snd_soc_acpi_amd_match soundwire_amd soundwire_generic_allocation snd_amd_sdw_acpi snd_hda_codec_alc269 soundwire_bus intel_rapl_msr amd_atl snd_hda_scodec_component snd_soc_sdca intel_rapl_common snd_hda_codec_realtek_lib mt7925e snd_soc_core snd_hda_codec_generic mt7925_common 
snd_hda_codec_atihdmi ac97_bus snd_hda_codec_nvhdmi mt792x_lib snd_pcm_dmaengine snd_hda_codec_hdmi snd_compress mt76_connac_lib snd_rpl_pci_acp6x mt76 hid_sensor_als snd_acp_pci hid_sensor_trigger snd_hda_intel snd_amd_acpi_mach industrialio_triggered_buffer uvcvideo Jan 10 18:39:17 zcache kernel: CR2: 0000000000000000 Jan 10 18:39:17 zcache kernel: ---[ end trace 0000000000000000 ]--- Jan 10 18:39:17 zcache kernel: RIP: 0010:mt76_connac_mcu_uni_add_dev+0xf1/0x210 [mt76_connac_lib] Jan 10 18:39:17 zcache kernel: RSP: 0018:ffffccb5e1d17cb0 EFLAGS: 00010282 Jan 10 18:39:17 zcache kernel: RAX: ffffccb5e1d17ce2 RBX: ffff8a0f37682060 RCX: 0000000000000013 Jan 10 18:39:17 zcache kernel: RDX: ffff8a0f0cd12588 RSI: 0000000000000000 RDI: 0000000000000000 Jan 10 18:39:17 zcache kernel: RBP: 0000000000000001 R08: 0000000000000000 R09: 0000000000000000 Jan 10 18:39:17 zcache kernel: R10: 0000000000000000 R11: ffffffff999362b0 R12: 0000000000000001 Jan 10 18:39:17 zcache kernel: R13: 0000000000000000 R14: ffff8a0f37682060 R15: ffff8a0f0cd12588 Jan 10 18:39:17 zcache kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 Jan 10 18:39:17 zcache kernel: CR2: 0000000000000000 CR3: 00000003f0a12000 CR4: 0000000000f50ef0 Zac Bowling ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: [PATCH v3 00/17] wifi: mt76: mt7925/mt792x: comprehensive stability fixes 2026-01-05 0:26 ` [PATCH v3 00/17] wifi: mt76: mt7925/mt792x: " Zac Bowling ` (16 preceding siblings ...) 2026-01-05 0:26 ` [PATCH 17/17] wifi: mt76: mt7925: add NULL checks in link station and TX queue setup Zac Bowling @ 2026-01-16 0:15 ` Sean Wang 2026-01-16 0:43 ` Zac Bowling ` (2 more replies) 17 siblings, 3 replies; 113+ messages in thread From: Sean Wang @ 2026-01-16 0:15 UTC (permalink / raw) To: Zac Bowling Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless, lorenzo, nbd, ryder.lee, sean.wang Hi Zac, Thanks for sharing this series. Overall the patches look good to me, and I’m continuing more testing to ensure there are no regressions on mt7925 and mt7921 further But today I do hit a kernel WARN in the disconnect path (mac80211 BA session teardown) while testing v3 of the series [ 3373.120224] Hardware name: HP HP EliteBook 830 G6/854A, BIOS R70 Ver. 01.22.00 10/14/2022 [ 3373.120228] Workqueue: events_unbound cfg80211_wiphy_work [cfg80211] [ 3373.120367] RIP: 0010:__ieee80211_stop_tx_ba_session+0x295/0x350 [mac80211] [ 3373.120570] Code: 11 0f 83 a3 00 00 00 48 c7 80 90 03 00 00 00 00 00 00 48 8b 7d 98 e8 4a 26 f3 fa 4c 89 ee 4c 89 ef e8 6f 16 0b fa 31 c0 eb 93 <0f> 0b 31 c0 eb 8d b8 8e ff ff ff eb 86 48 8b 7d 98 e8 25 26 f3 fa [ 3373.120576] RSP: 0018:ffffd00902ed7ba0 EFLAGS: 00010206 [ 3373.120583] RAX: 0000000000010003 RBX: 0000000000000003 RCX: 0000000000000000 [ 3373.120587] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 [ 3373.120591] RBP: ffffd00902ed7c10 R08: 0000000000000000 R09: 0000000000000000 [ 3373.120596] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 [ 3373.120599] R13: ffff8a8433717540 R14: ffff8a83e0b20960 R15: ffff8a834d42c000 [ 3373.120604] FS: 0000000000000000(0000) GS:ffff8a8477b03000(0000) knlGS:0000000000000000 [ 3373.120608] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 3373.120626] CR2: 
00007b9e0a8ba0d0 CR3: 000000009a440005 CR4: 00000000003726f0 [ 3373.120631] Call Trace: [ 3373.120656] <TASK> [ 3373.120664] ieee80211_sta_tear_down_BA_sessions+0x53/0xe0 [mac80211] [ 3373.120836] __sta_info_destroy_part1+0x48/0x550 [mac80211] [ 3373.120994] __sta_info_flush+0x10e/0x230 [mac80211] [ 3373.121150] ieee80211_set_disassoc+0x6b3/0x900 [mac80211] [ 3373.121293] ? _printk+0x5f/0x90 [ 3373.121330] __ieee80211_disconnect+0xd6/0x1a0 [mac80211] [ 3373.121446] ieee80211_beacon_connection_loss_work+0x6d/0xc0 [mac80211] [ 3373.121573] cfg80211_wiphy_work+0xb4/0x190 [cfg80211] [ 3373.121779] process_one_work+0x191/0x3e0 [ 3373.121789] worker_thread+0x2e3/0x420 [ 3373.121796] ? __pfx_worker_thread+0x10/0x10 [ 3373.121802] kthread+0x10d/0x230 [ 3373.121810] ? __pfx_kthread+0x10/0x10 [ 3373.121818] ret_from_fork+0x205/0x230 [ 3373.121826] ? __pfx_kthread+0x10/0x10 [ 3373.121832] ret_from_fork_asm+0x1a/0x30 [ 3373.121842] </TASK> [ 3373.121844] ---[ end trace 0000000000000000 ]--- [ 3373.128750] ------------[ cut here ]------------ [ 3373.128757] WARNING: CPU: 1 PID: 14854 at net/mac80211/agg-tx.c:398 __ieee80211_stop_tx_ba_session+0x295/0x350 [mac80211] I’m currently bisecting the series to identify which patch triggers it and will follow up once I have clearer results. Thanks again for the work and the DKMS setup. Sean On Sun, Jan 4, 2026 at 6:27 PM Zac Bowling <zbowling@gmail.com> wrote: > > From: Zac Bowling <zac@zacbowling.com> > > This patch series addresses kernel panics, system deadlocks, and various > stability issues in the MT7925 WiFi driver. The issues were discovered on > kernel 6.17 (Ubuntu 25.10) and fixes were developed and tested on 6.18.2. > > These patches are based on the wireless tree (nbd168/wireless.git) as > requested by Sean Wang. 
> > == Problem Description == > > The MT7925 driver has several bugs that cause: > - Kernel NULL pointer dereferences during BSSID roaming > - System-wide deadlocks requiring hard reboot > - Firmware reload failures after suspend/resume > - Key removal errors during MLO roaming > > These issues manifest approximately every 5 minutes when the adapter > tries to switch to a better BSSID, particularly in enterprise environments > with multiple access points. > > == Root Causes == > > 1. Missing mutex protection around ieee80211_iterate_active_interfaces() > when the callback invokes MCU functions (patches 2, 3, 16) > > 2. NULL pointer dereferences where mt792x_vif_to_bss_conf(), > mt792x_sta_to_link(), and similar functions return NULL during > MLO state transitions but results are not checked (patches 1, 4, 5, > 9, 10, 14, 17) > > 3. Ignored MCU return values hiding firmware errors (patches 6, 7, 8) > > 4. WARN_ON_ONCE used where NULL is expected during normal MLO AP > setup (patch 13) > > 5. Firmware semaphore not released after failed load attempts (patch 15) > > 6. Key removal returning error when link is already torn down (patch 12) > > == Testing == > > Stress tested by hammering the driver with custom test script. 
> > Tested on: > - Framework Desktop (AMD Ryzen AI Max 300 Series) with MT7925 (RZ717) > - This whole patch series was tested on Kernel 6.18.2 and 6.17.12 (Ubuntu 25.10) > - Enterprise WiFi environment with multiple WIFI 7 APs with MLO enabled > > Before patches: System hangs/panics every 5-15 minutes during BSSID roaming > After patches: Stable for 24+ hours under continuous stress testing > > == Crash Traces Fixed == > > Primary NULL pointer dereference: > BUG: kernel NULL pointer dereference, address: 0000000000000010 > Workqueue: mt76 mt7925_mac_reset_work [mt7925_common] > RIP: 0010:mt76_connac_mcu_uni_add_dev+0x9c/0x780 [mt76_connac_lib] > Call Trace: > mt7925_vif_connect_iter+0xcb/0x240 [mt7925_common] > __iterate_interfaces+0x92/0x130 [mac80211] > ieee80211_iterate_interfaces+0x3d/0x60 [mac80211] > mt7925_mac_reset_work+0x105/0x190 [mt7925_common] > > Deadlock trace: > INFO: task kworker/u128:0:48737 blocked for more than 122 seconds. > Workqueue: mt76 mt7925_mac_reset_work [mt7925_common] > Call Trace: > __mutex_lock.constprop.0+0x3d0/0x6d0 > mt7925_mac_reset_work+0x85/0x170 [mt7925_common] > > == Related Links == > > Framework Community discussion: > https://community.frame.work/t/kernel-panic-from-wifi-mediatek-mt7925-nullptr-dereference/79301 > > OpenWrt GitHub issues: > https://github.com/openwrt/mt76/issues/1014 > https://github.com/openwrt/mt76/issues/1036 > > GitHub repository with additional analysis: > https://github.com/zbowling/mt7925 > > Zac Bowling (17): > wifi: mt76: mt7925: fix NULL pointer dereference in vif iteration > wifi: mt76: mt7925: fix missing mutex protection in reset and ROC abort > wifi: mt76: mt7925: fix missing mutex protection in runtime PM and MLO PM > wifi: mt76: mt7925: add NULL checks in MCU STA TLV functions > wifi: mt76: mt7925: add NULL checks for link_conf and mlink in main.c > wifi: mt76: mt7925: add error handling for AMPDU MCU commands > wifi: mt76: mt7925: add error handling for BSS info MCU command in sta_add > 
wifi: mt76: mt7925: add error handling for BSS info in key setup > wifi: mt76: mt7925: add NULL checks in MLO link and chanctx functions > wifi: mt76: mt792x: fix NULL pointer dereference in TX path > wifi: mt76: mt7925: add lockdep assertions for mutex verification > wifi: mt76: mt7925: fix key removal failure during MLO roaming > wifi: mt76: mt7925: fix kernel warning in MLO ROC setup > wifi: mt76: mt7925: add NULL checks for MLO link pointers in MCU functions > wifi: mt76: mt792x: fix firmware reload failure after previous load crash > wifi: mt76: mt7925: add mutex protection in resume path > wifi: mt76: mt7925: add NULL checks in link station and TX queue setup > > drivers/net/wireless/mediatek/mt76/mt792x_core.c | 27 +++++++++++++++- > drivers/net/wireless/mediatek/mt76/mt7925/mac.c | 8 +++++ > drivers/net/wireless/mediatek/mt76/mt7925/main.c | 95 +++++++++++++++++++++--- > drivers/net/wireless/mediatek/mt76/mt7925/mcu.c | 52 ++++++++++++++--- > drivers/net/wireless/mediatek/mt76/mt7925/pci.c | 6 +++ > 5 files changed, 170 insertions(+), 18 deletions(-) > > -- > 2.51.0 > ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: [PATCH v3 00/17] wifi: mt76: mt7925/mt792x: comprehensive stability fixes
  2026-01-16 0:15 ` [PATCH v3 00/17] wifi: mt76: mt7925/mt792x: comprehensive stability fixes Sean Wang
@ 2026-01-16 0:43   ` Zac Bowling
  2026-01-16 1:04   ` [PATCH v4 00/21] wifi: mt76: mt7925/mt7921: stability and MLO fixes Zac
  2026-01-20 6:28   ` [PATCH v5 00/11] wifi: mt76: mt7925/mt7921 stability fixes Zac
  2 siblings, 0 replies; 113+ messages in thread
From: Zac Bowling @ 2026-01-16 0:43 UTC (permalink / raw)
  To: Sean Wang
  Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless,
	lorenzo, nbd, ryder.lee, sean.wang

Hi Sean,

Thanks for testing this and catching that WARN. Good catch.

Yeah, that was my bug. In one of my attempts to handle every error
return that my static analyzer flagged as unhandled, I returned too
early and never reached a required callback.

It is already patched locally and in my repo. I will send it shortly,
once my poor man's stress test finishes running.

I also found another bug this morning that I still need to send a fix
for: device resets coming out of suspend hit a corrupted list left over
from the previous initialization.

Zac Bowling

On Thu, Jan 15, 2026 at 4:15 PM Sean Wang <sean.wang@kernel.org> wrote:
>
> Hi Zac,
>
> Thanks for sharing this series. Overall the patches look good to me,
> and I’m continuing more testing to ensure there are no regressions on
> mt7925 and mt7921 further
> But today I do hit a kernel WARN in the disconnect path (mac80211 BA
> session teardown) while testing v3 of the series
>
> [ 3373.120224] Hardware name: HP HP EliteBook 830 G6/854A, BIOS R70
> Ver.
01.22.00 10/14/2022 > [ 3373.120228] Workqueue: events_unbound cfg80211_wiphy_work [cfg80211] > [ 3373.120367] RIP: 0010:__ieee80211_stop_tx_ba_session+0x295/0x350 [mac80211] > [ 3373.120570] Code: 11 0f 83 a3 00 00 00 48 c7 80 90 03 00 00 00 00 > 00 00 48 8b 7d 98 e8 4a 26 f3 fa 4c 89 ee 4c 89 ef e8 6f 16 0b fa 31 > c0 eb 93 <0f> 0b 31 c0 eb 8d b8 8e ff ff ff eb 86 48 8b 7d 98 e8 25 26 > f3 fa > [ 3373.120576] RSP: 0018:ffffd00902ed7ba0 EFLAGS: 00010206 > [ 3373.120583] RAX: 0000000000010003 RBX: 0000000000000003 RCX: 0000000000000000 > [ 3373.120587] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 > [ 3373.120591] RBP: ffffd00902ed7c10 R08: 0000000000000000 R09: 0000000000000000 > [ 3373.120596] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 > [ 3373.120599] R13: ffff8a8433717540 R14: ffff8a83e0b20960 R15: ffff8a834d42c000 > [ 3373.120604] FS: 0000000000000000(0000) GS:ffff8a8477b03000(0000) > knlGS:0000000000000000 > [ 3373.120608] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [ 3373.120626] CR2: 00007b9e0a8ba0d0 CR3: 000000009a440005 CR4: 00000000003726f0 > [ 3373.120631] Call Trace: > [ 3373.120656] <TASK> > [ 3373.120664] ieee80211_sta_tear_down_BA_sessions+0x53/0xe0 [mac80211] > [ 3373.120836] __sta_info_destroy_part1+0x48/0x550 [mac80211] > [ 3373.120994] __sta_info_flush+0x10e/0x230 [mac80211] > [ 3373.121150] ieee80211_set_disassoc+0x6b3/0x900 [mac80211] > [ 3373.121293] ? _printk+0x5f/0x90 > [ 3373.121330] __ieee80211_disconnect+0xd6/0x1a0 [mac80211] > [ 3373.121446] ieee80211_beacon_connection_loss_work+0x6d/0xc0 [mac80211] > [ 3373.121573] cfg80211_wiphy_work+0xb4/0x190 [cfg80211] > [ 3373.121779] process_one_work+0x191/0x3e0 > [ 3373.121789] worker_thread+0x2e3/0x420 > [ 3373.121796] ? __pfx_worker_thread+0x10/0x10 > [ 3373.121802] kthread+0x10d/0x230 > [ 3373.121810] ? __pfx_kthread+0x10/0x10 > [ 3373.121818] ret_from_fork+0x205/0x230 > [ 3373.121826] ? 
__pfx_kthread+0x10/0x10 > [ 3373.121832] ret_from_fork_asm+0x1a/0x30 > [ 3373.121842] </TASK> > [ 3373.121844] ---[ end trace 0000000000000000 ]--- > [ 3373.128750] ------------[ cut here ]------------ > [ 3373.128757] WARNING: CPU: 1 PID: 14854 at net/mac80211/agg-tx.c:398 > __ieee80211_stop_tx_ba_session+0x295/0x350 [mac80211] > > I’m currently bisecting the series to identify which patch triggers it > and will follow up once I have clearer results. > Thanks again for the work and the DKMS setup. > > Sean > > On Sun, Jan 4, 2026 at 6:27 PM Zac Bowling <zbowling@gmail.com> wrote: > > > > From: Zac Bowling <zac@zacbowling.com> > > > > This patch series addresses kernel panics, system deadlocks, and various > > stability issues in the MT7925 WiFi driver. The issues were discovered on > > kernel 6.17 (Ubuntu 25.10) and fixes were developed and tested on 6.18.2. > > > > These patches are based on the wireless tree (nbd168/wireless.git) as > > requested by Sean Wang. > > > > == Problem Description == > > > > The MT7925 driver has several bugs that cause: > > - Kernel NULL pointer dereferences during BSSID roaming > > - System-wide deadlocks requiring hard reboot > > - Firmware reload failures after suspend/resume > > - Key removal errors during MLO roaming > > > > These issues manifest approximately every 5 minutes when the adapter > > tries to switch to a better BSSID, particularly in enterprise environments > > with multiple access points. > > > > == Root Causes == > > > > 1. Missing mutex protection around ieee80211_iterate_active_interfaces() > > when the callback invokes MCU functions (patches 2, 3, 16) > > > > 2. NULL pointer dereferences where mt792x_vif_to_bss_conf(), > > mt792x_sta_to_link(), and similar functions return NULL during > > MLO state transitions but results are not checked (patches 1, 4, 5, > > 9, 10, 14, 17) > > > > 3. Ignored MCU return values hiding firmware errors (patches 6, 7, 8) > > > > 4. 
WARN_ON_ONCE used where NULL is expected during normal MLO AP > > setup (patch 13) > > > > 5. Firmware semaphore not released after failed load attempts (patch 15) > > > > 6. Key removal returning error when link is already torn down (patch 12) > > > > == Testing == > > > > Stress tested by hammering the driver with custom test script. > > > > Tested on: > > - Framework Desktop (AMD Ryzen AI Max 300 Series) with MT7925 (RZ717) > > - This whole patch series was tested on Kernel 6.18.2 and 6.17.12 (Ubuntu 25.10) > > - Enterprise WiFi environment with multiple WIFI 7 APs with MLO enabled > > > > Before patches: System hangs/panics every 5-15 minutes during BSSID roaming > > After patches: Stable for 24+ hours under continuous stress testing > > > > == Crash Traces Fixed == > > > > Primary NULL pointer dereference: > > BUG: kernel NULL pointer dereference, address: 0000000000000010 > > Workqueue: mt76 mt7925_mac_reset_work [mt7925_common] > > RIP: 0010:mt76_connac_mcu_uni_add_dev+0x9c/0x780 [mt76_connac_lib] > > Call Trace: > > mt7925_vif_connect_iter+0xcb/0x240 [mt7925_common] > > __iterate_interfaces+0x92/0x130 [mac80211] > > ieee80211_iterate_interfaces+0x3d/0x60 [mac80211] > > mt7925_mac_reset_work+0x105/0x190 [mt7925_common] > > > > Deadlock trace: > > INFO: task kworker/u128:0:48737 blocked for more than 122 seconds. 
> > Workqueue: mt76 mt7925_mac_reset_work [mt7925_common] > > Call Trace: > > __mutex_lock.constprop.0+0x3d0/0x6d0 > > mt7925_mac_reset_work+0x85/0x170 [mt7925_common] > > > > == Related Links == > > > > Framework Community discussion: > > https://community.frame.work/t/kernel-panic-from-wifi-mediatek-mt7925-nullptr-dereference/79301 > > > > OpenWrt GitHub issues: > > https://github.com/openwrt/mt76/issues/1014 > > https://github.com/openwrt/mt76/issues/1036 > > > > GitHub repository with additional analysis: > > https://github.com/zbowling/mt7925 > > > > Zac Bowling (17): > > wifi: mt76: mt7925: fix NULL pointer dereference in vif iteration > > wifi: mt76: mt7925: fix missing mutex protection in reset and ROC abort > > wifi: mt76: mt7925: fix missing mutex protection in runtime PM and MLO PM > > wifi: mt76: mt7925: add NULL checks in MCU STA TLV functions > > wifi: mt76: mt7925: add NULL checks for link_conf and mlink in main.c > > wifi: mt76: mt7925: add error handling for AMPDU MCU commands > > wifi: mt76: mt7925: add error handling for BSS info MCU command in sta_add > > wifi: mt76: mt7925: add error handling for BSS info in key setup > > wifi: mt76: mt7925: add NULL checks in MLO link and chanctx functions > > wifi: mt76: mt792x: fix NULL pointer dereference in TX path > > wifi: mt76: mt7925: add lockdep assertions for mutex verification > > wifi: mt76: mt7925: fix key removal failure during MLO roaming > > wifi: mt76: mt7925: fix kernel warning in MLO ROC setup > > wifi: mt76: mt7925: add NULL checks for MLO link pointers in MCU functions > > wifi: mt76: mt792x: fix firmware reload failure after previous load crash > > wifi: mt76: mt7925: add mutex protection in resume path > > wifi: mt76: mt7925: add NULL checks in link station and TX queue setup > > > > drivers/net/wireless/mediatek/mt76/mt792x_core.c | 27 +++++++++++++++- > > drivers/net/wireless/mediatek/mt76/mt7925/mac.c | 8 +++++ > > drivers/net/wireless/mediatek/mt76/mt7925/main.c | 95 
+++++++++++++++++++++--- > > drivers/net/wireless/mediatek/mt76/mt7925/mcu.c | 52 ++++++++++++++--- > > drivers/net/wireless/mediatek/mt76/mt7925/pci.c | 6 +++ > > 5 files changed, 170 insertions(+), 18 deletions(-) > > > > -- > > 2.51.0 > > ^ permalink raw reply [flat|nested] 113+ messages in thread
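The early-return bug described in the reply above (an error path that returns before a required callback runs) can be pictured with a small userspace model. This is only a sketch under stated assumptions: struct ba_session, mcu_stop_ba(), notify_stack() and both stop_ba_*() functions are hypothetical names invented for illustration, not mt76 or mac80211 functions, and returning the command's error code after notifying is an assumption rather than something the thread states.

```c
#include <errno.h>

/* Toy model; every name here is hypothetical, not the mt76/mac80211 API. */
struct ba_session {
	int fw_stopped;     /* firmware acknowledged the stop command */
	int stack_notified; /* upper layer was told the session is gone */
};

/* Stand-in for an MCU command; fails when the firmware is unreachable. */
static int mcu_stop_ba(struct ba_session *s, int fw_reachable)
{
	if (!fw_reachable)
		return -EIO;
	s->fw_stopped = 1;
	return 0;
}

/* Stand-in for the completion callback the upper layer waits for. */
static void notify_stack(struct ba_session *s)
{
	s->stack_notified = 1;
}

/* Buggy shape: the error path returns before the callback runs. */
static int stop_ba_early_return(struct ba_session *s, int fw_reachable)
{
	int ret = mcu_stop_ba(s, fw_reachable);

	if (ret)
		return ret;
	notify_stack(s);
	return 0;
}

/* Fixed shape: the callback runs whether or not the command succeeded. */
static int stop_ba_always_notify(struct ba_session *s, int fw_reachable)
{
	int ret = mcu_stop_ba(s, fw_reachable);

	notify_stack(s);
	return ret; /* assumption: the error is still reported to the caller */
}
```

With an unreachable firmware, the first variant leaves stack_notified at 0 while the second sets it to 1, which is the difference between a teardown that never completes and one that does.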
* [PATCH v4 00/21] wifi: mt76: mt7925/mt7921: stability and MLO fixes
  2026-01-16 0:15 ` [PATCH v3 00/17] wifi: mt76: mt7925/mt792x: comprehensive stability fixes Sean Wang
  2026-01-16 0:43   ` Zac Bowling
@ 2026-01-16 1:04   ` Zac
  2026-01-16 1:04     ` [PATCH v4 01/21] wifi: mt76: mt7925: fix NULL pointer dereference in vif iteration Zac
                      ` (20 more replies)
  2026-01-20 6:28   ` [PATCH v5 00/11] wifi: mt76: mt7925/mt7921 stability fixes Zac
  2 siblings, 21 replies; 113+ messages in thread
From: Zac @ 2026-01-16 1:04 UTC (permalink / raw)
  To: sean.wang
  Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless,
	lorenzo, nbd, linux, ryder.lee, sean.wang, Zac

This series addresses stability issues in the mt7925 (WiFi 7) and mt7921
drivers, focusing on NULL pointer dereferences, mutex protection, and
MLO (Multi-Link Operation) handling.

Changes since v3:
- Added mt7921 driver fixes (patches 18-19) to address mutex handling
  issues that also affected the older driver
- Fixed mutex deadlocks in mt7921 suspend paths - the mutex was being
  acquired inside functions that were already called with mutex held
- Added mt76 core fix for list corruption in mt76_wcid_cleanup (patch 20)
  that caused crashes during suspend/resume cycles
- Added fix for BA session teardown during beacon loss (patch 21) which
  was causing mac80211 WARN in __ieee80211_stop_tx_ba_session - reported
  by Sean Wang

The mt7921 mutex fixes (patches 18-19) correct improper mutex nesting
where mt7921_roc_abort_sync() and mt7921_set_runtime_pm() were acquiring
the mutex internally, but were called from paths that already held it
(e.g., mt7921_mac_sta_remove via mt76_sta_remove, suspend handlers).

The list corruption fix (patch 20) addresses a bug where
mt76_wcid_cleanup() wasn't removing wcid entries from sta_poll_list
before mt76_reset_device() reinitialized the master list, leaving stale
pointers.

The BA session fix (patch 21) makes the ieee80211_stop_tx_ba_cb_irqsafe()
callback unconditional in IEEE80211_AMPDU_TX_STOP_CONT handling - the MCU
command may fail during beacon loss but mac80211 must still be notified
to complete the session teardown.

More notes in https://github.com/zbowling/mt7925

Tested on MT7925 (RZ616) with kernel 6.18.5.

Zac Bowling (21):
  wifi: mt76: mt7921: fix missing mutex protection in multiple paths
  wifi: mt76: mt7921: fix mutex deadlocks in multiple paths
  wifi: mt76: fix list corruption in mt76_wcid_cleanup
  wifi: mt76: mt7925: fix BA session teardown during beacon loss
  wifi: mt76: mt7925: fix NULL pointer dereference in vif iteration
  wifi: mt76: mt7925: fix missing mutex protection in reset and ROC abort
  wifi: mt76: mt7925: fix missing mutex protection in runtime PM and MLO PM
  wifi: mt76: mt7925: add NULL checks in MCU STA TLV functions
  wifi: mt76: mt7925: add NULL checks for link_conf and mlink in main.c
  wifi: mt76: mt7925: add error handling for AMPDU MCU commands
  wifi: mt76: mt7925: add error handling for BSS info MCU command in sta_add
  wifi: mt76: mt7925: add error handling for BSS info in key setup
  wifi: mt76: mt7925: add NULL checks in MLO link and chanctx functions
  wifi: mt76: mt792x: fix NULL pointer dereference in TX path
  wifi: mt76: mt7925: add lockdep assertions for mutex verification
  wifi: mt76: mt7925: fix key removal failure during MLO roaming
  wifi: mt76: mt7925: fix kernel warning in MLO ROC setup
  wifi: mt76: mt7925: add NULL checks for MLO link pointers in MCU functions
  wifi: mt76: mt792x: fix firmware reload failure after previous load crash
  wifi: mt76: mt7925: add mutex protection in resume path
  wifi: mt76: mt7925: add NULL checks in link station and TX queue setup

 drivers/net/wireless/mediatek/mt76/mac80211.c |  10 ++
 .../net/wireless/mediatek/mt76/mt7921/mac.c   |   2 +
 .../net/wireless/mediatek/mt76/mt7921/main.c  |   8 ++
 .../net/wireless/mediatek/mt76/mt7921/pci.c   |   2 +
 .../net/wireless/mediatek/mt76/mt7921/sdio.c  |   2 +
 .../net/wireless/mediatek/mt76/mt7925/mac.c   |   8 ++
 .../net/wireless/mediatek/mt76/mt7925/main.c  | 125 ++++++++++++++----
 .../net/wireless/mediatek/mt76/mt7925/mcu.c   |  48 +++++--
 .../net/wireless/mediatek/mt76/mt7925/pci.c   |   4 +
 .../net/wireless/mediatek/mt76/mt792x_core.c  |  27 +++-
 10 files changed, 203 insertions(+), 33 deletions(-)

--
2.52.0

^ permalink raw reply [flat|nested] 113+ messages in thread
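A minimal userspace sketch of the list corruption the cover letter attributes to patch 20, assuming a circular doubly linked list in the style of the kernel's struct list_head. The helpers below are hand-written stand-ins rather than <linux/list.h>, and struct wcid_entry, wcid_cleanup() and entry_is_stale() are hypothetical names used only for illustration; the real change lives in mt76_wcid_cleanup() and is not reproduced here.

```c
/* Hand-written circular doubly linked list, modelled on the kernel style. */
struct list_head {
	struct list_head *next, *prev;
};

static void list_init(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

static int list_is_empty(const struct list_head *h)
{
	return h->next == h;
}

static void list_add_to_tail(struct list_head *e, struct list_head *head)
{
	e->prev = head->prev;
	e->next = head;
	head->prev->next = e;
	head->prev = e;
}

/* Unlink an entry and leave it pointing at itself (safe to unlink again). */
static void list_unlink_and_init(struct list_head *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
	list_init(e);
}

/* Hypothetical per-station entry that may sit on a poll list. */
struct wcid_entry {
	struct list_head poll_list;
};

/* Cleanup that detaches the entry before the list head is reinitialized. */
static void wcid_cleanup(struct wcid_entry *w)
{
	if (!list_is_empty(&w->poll_list))
		list_unlink_and_init(&w->poll_list);
}

/* An entry is stale if it still looks linked while the head is empty. */
static int entry_is_stale(const struct wcid_entry *w,
			  const struct list_head *head)
{
	return !list_is_empty(&w->poll_list) && list_is_empty(head);
}
```

Reinitializing the head while an entry is still linked leaves that entry pointing into a list that no longer references it. Detaching the entry first keeps both sides consistent.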
* [PATCH v4 01/21] wifi: mt76: mt7925: fix NULL pointer dereference in vif iteration
  2026-01-16 1:04 ` [PATCH v4 00/21] wifi: mt76: mt7925/mt7921: stability and MLO fixes Zac
@ 2026-01-16 1:04   ` Zac
  2026-01-16 1:05   ` [PATCH v4 02/21] wifi: mt76: mt7925: fix missing mutex protection in reset and ROC abort Zac
                    ` (19 subsequent siblings)
  20 siblings, 0 replies; 113+ messages in thread
From: Zac @ 2026-01-16 1:04 UTC (permalink / raw)
  To: sean.wang
  Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless,
	lorenzo, nbd, linux, ryder.lee, sean.wang, Zac Bowling,
	Zac Bowling

From: Zac Bowling <zbowling@gmail.com>

mt792x_vif_to_bss_conf() can return NULL when iterating over valid_links
during HW reset or other state transitions, because the link
configuration in mac80211 may not be set up yet even though the driver's
valid_links bitmap has the link marked as valid.

This causes a NULL pointer dereference in mt76_connac_mcu_uni_add_dev()
when it tries to access bss_conf->vif->type, and similar crashes in
other functions that use bss_conf without checking.

This crash was observed on Framework Desktop (AMD Ryzen AI Max 300) with
MT7925 (RZ717) running kernel 6.17. The panic occurs during BSSID
roaming when the adapter attempts to switch to a better access point:

  BUG: kernel NULL pointer dereference, address: 0000000000000010
  CPU: 1 UID: 0 PID: 8362 Comm: kworker/u128:10 Tainted: G OE
  Workqueue: mt76 mt7925_mac_reset_work [mt7925_common]
  RIP: 0010:mt76_connac_mcu_uni_add_dev+0x9c/0x780 [mt76_connac_lib]
  Call Trace:
   mt7925_vif_connect_iter+0xcb/0x240 [mt7925_common]
   __iterate_interfaces+0x92/0x130 [mac80211]
   ieee80211_iterate_interfaces+0x3d/0x60 [mac80211]
   mt7925_mac_reset_work+0x105/0x190 [mt7925_common]
   process_one_work+0x18b/0x370
   worker_thread+0x317/0x450

The issue manifests approximately every 5 minutes when the adapter tries
to hop to a better BSSID, causing system-wide hangs where network
commands (ip, ifconfig, etc.) hang indefinitely.

Add NULL checks for bss_conf before using it in:
- mt7925_vif_connect_iter()
- mt7925_change_vif_links()
- mt7925_mac_sta_assoc()
- mt7925_mac_sta_remove_links()

Fixes: c948b5da6bbe ("wifi: mt76: mt7925: add Mediatek Wi-Fi7 driver for mt7925 chips")
Link: https://community.frame.work/t/kernel-panic-from-wifi-mediatek-mt7925-nullptr-dereference/79301
Reported-by: Zac Bowling <zac@zacbowling.com>
Tested-by: Zac Bowling <zac@zacbowling.com>
Signed-off-by: Zac Bowling <zac@zacbowling.com>
---
 drivers/net/wireless/mediatek/mt76/mt7925/mac.c  | 6 ++++++
 drivers/net/wireless/mediatek/mt76/mt7925/main.c | 8 ++++++++
 2 files changed, 14 insertions(+)

diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/mac.c b/drivers/net/wireless/mediatek/mt76/mt7925/mac.c
index 871b671019..184efe8afa 100644
--- a/drivers/net/wireless/mediatek/mt76/mt7925/mac.c
+++ b/drivers/net/wireless/mediatek/mt76/mt7925/mac.c
@@ -1271,6 +1271,12 @@ mt7925_vif_connect_iter(void *priv, u8 *mac,
 		bss_conf = mt792x_vif_to_bss_conf(vif, i);
 		mconf = mt792x_vif_to_link(mvif, i);
 
+		/* Skip links that don't have bss_conf set up yet in mac80211.
+		 * This can happen during HW reset when link state is inconsistent.
+		 */
+		if (!bss_conf)
+			continue;
+
 		mt76_connac_mcu_uni_add_dev(&dev->mphy, bss_conf, &mconf->mt76,
 					    &mvif->sta.deflink.wcid, true);
 		mt7925_mcu_set_tx(dev, bss_conf);
diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/main.c b/drivers/net/wireless/mediatek/mt76/mt7925/main.c
index 2d358a9664..3001a62a8b 100644
--- a/drivers/net/wireless/mediatek/mt76/mt7925/main.c
+++ b/drivers/net/wireless/mediatek/mt76/mt7925/main.c
@@ -1304,6 +1304,8 @@ mt7925_mlo_pm_iter(void *priv, u8 *mac, struct ieee80211_vif *vif)
 	mt792x_mutex_acquire(dev);
 	for_each_set_bit(i, &valid, IEEE80211_MLD_MAX_NUM_LINKS) {
 		bss_conf = mt792x_vif_to_bss_conf(vif, i);
+		if (!bss_conf)
+			continue;
 		mt7925_mcu_uni_bss_ps(dev, bss_conf);
 	}
 	mt792x_mutex_release(dev);
@@ -1630,6 +1632,8 @@ static void mt7925_ipv6_addr_change(struct ieee80211_hw *hw,
 
 	for_each_set_bit(i, &valid, IEEE80211_MLD_MAX_NUM_LINKS) {
 		bss_conf = mt792x_vif_to_bss_conf(vif, i);
+		if (!bss_conf)
+			continue;
 		__mt7925_ipv6_addr_change(hw, bss_conf, idev);
 	}
 }
@@ -1861,6 +1865,8 @@ static void mt7925_vif_cfg_changed(struct ieee80211_hw *hw,
 	if (changed & BSS_CHANGED_ARP_FILTER) {
 		for_each_set_bit(i, &valid, IEEE80211_MLD_MAX_NUM_LINKS) {
 			bss_conf = mt792x_vif_to_bss_conf(vif, i);
+			if (!bss_conf)
+				continue;
 			mt7925_mcu_update_arp_filter(&dev->mt76, bss_conf);
 		}
 	}
@@ -1876,6 +1882,8 @@ static void mt7925_vif_cfg_changed(struct ieee80211_hw *hw,
 			} else if (mvif->mlo_pm_state == MT792x_MLO_CHANGED_PS) {
 				for_each_set_bit(i, &valid, IEEE80211_MLD_MAX_NUM_LINKS) {
 					bss_conf = mt792x_vif_to_bss_conf(vif, i);
+					if (!bss_conf)
+						continue;
 					mt7925_mcu_uni_bss_ps(dev, bss_conf);
 				}
 			}
--
2.52.0

^ permalink raw reply related [flat|nested] 113+ messages in thread
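The skip-on-NULL pattern that the patch above adds to each valid_links loop can be sketched in plain userspace C. This is an illustration under stated assumptions: struct vif, struct bss_conf, vif_to_bss_conf() and configure_valid_links() are simplified hypothetical stand-ins for the mac80211 and mt792x types and helpers, and MAX_LINKS is only a stand-in for IEEE80211_MLD_MAX_NUM_LINKS.

```c
#include <stddef.h>

#define MAX_LINKS 15 /* stand-in for IEEE80211_MLD_MAX_NUM_LINKS */

/* Hypothetical, simplified stand-ins for the real structures. */
struct bss_conf {
	int link_id;
};

struct vif {
	unsigned long valid_links;             /* driver-side bitmap */
	struct bss_conf *link_conf[MAX_LINKS]; /* stack side, may be NULL */
};

/* Plays the role of mt792x_vif_to_bss_conf(): may return NULL. */
static struct bss_conf *vif_to_bss_conf(struct vif *vif, unsigned int link_id)
{
	return link_id < MAX_LINKS ? vif->link_conf[link_id] : NULL;
}

/*
 * Walk every link marked valid in the bitmap and skip the ones whose
 * configuration is not available yet. Returns how many links were used.
 */
static int configure_valid_links(struct vif *vif)
{
	int configured = 0;
	unsigned int i;

	for (i = 0; i < MAX_LINKS; i++) {
		struct bss_conf *conf;

		if (!(vif->valid_links & (1UL << i)))
			continue;

		conf = vif_to_bss_conf(vif, i);
		if (!conf)
			continue; /* bitmap and link config disagree: skip */

		configured++; /* a real driver would issue MCU commands here */
	}

	return configured;
}
```

With two bits set in valid_links but only link 0 populated, the loop uses one link and silently skips the other instead of dereferencing NULL.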
* [PATCH v4 02/21] wifi: mt76: mt7925: fix missing mutex protection in reset and ROC abort 2026-01-16 1:04 ` [PATCH v4 00/21] wifi: mt76: mt7925/mt7921: stability and MLO fixes Zac 2026-01-16 1:04 ` [PATCH v4 01/21] wifi: mt76: mt7925: fix NULL pointer dereference in vif iteration Zac @ 2026-01-16 1:05 ` Zac 2026-01-16 1:05 ` [PATCH v4 03/21] wifi: mt76: mt7925: fix missing mutex protection in runtime PM and MLO PM Zac ` (18 subsequent siblings) 20 siblings, 0 replies; 113+ messages in thread From: Zac @ 2026-01-16 1:05 UTC (permalink / raw) To: sean.wang Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless, lorenzo, nbd, linux, ryder.lee, sean.wang, Zac Bowling, Zac Bowling From: Zac Bowling <zbowling@gmail.com> During firmware recovery and ROC (Remain On Channel) abort operations, the driver iterates over active interfaces and calls MCU functions that require the device mutex to be held, but the mutex was not acquired. This causes system-wide deadlocks where the system becomes completely unresponsive. From logs on affected systems: INFO: task kworker/u128:0:48737 blocked for more than 122 seconds. Workqueue: mt76 mt7925_mac_reset_work [mt7925_common] Call Trace: __schedule+0x426/0x12c0 schedule+0x27/0xf0 schedule_preempt_disabled+0x15/0x30 __mutex_lock.constprop.0+0x3d0/0x6d0 mt7925_mac_reset_work+0x85/0x170 [mt7925_common] The deadlock manifests approximately every 5 minutes when the adapter tries to hop to a better BSSID, triggering firmware reset. Network commands (ip, ifconfig, etc.) hang indefinitely, processes get stuck in uninterruptible sleep (D state), and reboot hangs as well. 
Add mutex protection around interface iteration in: - mt7925_mac_reset_work(): Called during firmware recovery after MCU timeouts to reconnect all interfaces - mt7925_roc_abort_sync() in suspend path: Called during suspend to clean up Remain On Channel operations This matches the pattern used in mt7615 and other MediaTek drivers where interface iteration callbacks invoke MCU functions with mutex held: // mt7615/main.c - roc_work has mutex protection mt7615_mutex_acquire(phy->dev); ieee80211_iterate_active_interfaces(...); mt7615_mutex_release(phy->dev); Note: Sean Wang from MediaTek has submitted an alternative fix for the ROC path using cancel_delayed_work() instead of cancel_delayed_work_sync(). Both approaches address the deadlock; this one adds explicit mutex protection which may be superseded by the upstream fix. Fixes: c948b5da6bbe ("wifi: mt76: mt7925: add Mediatek Wi-Fi7 driver for mt7925 chips") Link: https://community.frame.work/t/kernel-panic-from-wifi-mediatek-mt7925-nullptr-dereference/79301 Reported-by: Zac Bowling <zac@zacbowling.com> Tested-by: Zac Bowling <zac@zacbowling.com> Signed-off-by: Zac Bowling <zac@zacbowling.com> --- drivers/net/wireless/mediatek/mt76/mt7925/mac.c | 2 ++ drivers/net/wireless/mediatek/mt76/mt7925/pci.c | 2 ++ 2 files changed, 4 insertions(+) diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/mac.c b/drivers/net/wireless/mediatek/mt76/mt7925/mac.c index 184efe8afa..06420ac6ed 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7925/mac.c +++ b/drivers/net/wireless/mediatek/mt76/mt7925/mac.c @@ -1331,9 +1331,11 @@ void mt7925_mac_reset_work(struct work_struct *work) dev->hw_full_reset = false; pm->suspended = false; ieee80211_wake_queues(hw); + mt792x_mutex_acquire(dev); ieee80211_iterate_active_interfaces(hw, IEEE80211_IFACE_ITER_RESUME_ALL, mt7925_vif_connect_iter, NULL); + mt792x_mutex_release(dev); mt76_connac_power_save_sched(&dev->mt76.phy, pm); mt7925_regd_change(&dev->phy, "00"); diff --git 
a/drivers/net/wireless/mediatek/mt76/mt7925/pci.c b/drivers/net/wireless/mediatek/mt76/mt7925/pci.c index c4161754c0..e9d62c6aee 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7925/pci.c +++ b/drivers/net/wireless/mediatek/mt76/mt7925/pci.c @@ -455,7 +455,9 @@ static int mt7925_pci_suspend(struct device *device) cancel_delayed_work_sync(&pm->ps_work); cancel_work_sync(&pm->wake_work); + mt792x_mutex_acquire(dev); mt7925_roc_abort_sync(dev); + mt792x_mutex_release(dev); err = mt792x_mcu_drv_pmctrl(dev); if (err < 0) -- 2.52.0 ^ permalink raw reply related [flat|nested] 113+ messages in thread
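
For readers following the locking convention that patch 02 relies on (the caller takes the device mutex once, then the iteration callback issues MCU commands under it), here is a minimal userspace sketch. It is only a model: every toy_* name is hypothetical and not part of mt76 or mac80211, and the "mutex" is a plain flag, so it illustrates the call structure rather than real concurrency.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the device: the "mutex" only tracks whether it is held. */
struct toy_dev {
    bool mutex_held;
    int mcu_cmds_sent;   /* commands issued with the lock held */
    int unlocked_calls;  /* commands attempted without the lock */
};

static void toy_mutex_acquire(struct toy_dev *dev)
{
    dev->mutex_held = true;
}

static void toy_mutex_release(struct toy_dev *dev)
{
    /* Releasing a lock that is not held would be a bug in the caller. */
    assert(dev->mutex_held);
    dev->mutex_held = false;
}

/* Stand-in for an MCU command that expects the lock to be held. */
static void toy_mcu_cmd(struct toy_dev *dev)
{
    if (dev->mutex_held)
        dev->mcu_cmds_sent++;
    else
        dev->unlocked_calls++;
}

typedef void (*toy_iter_fn)(struct toy_dev *dev, int vif_idx);

/* Per-interface callback: does not lock, it relies on the caller. */
static void toy_connect_iter(struct toy_dev *dev, int vif_idx)
{
    (void)vif_idx;
    toy_mcu_cmd(dev);
}

/* Stand-in for an "iterate over active interfaces" helper. */
static void toy_iterate_interfaces(struct toy_dev *dev, int n_vifs,
                                   toy_iter_fn fn)
{
    for (int i = 0; i < n_vifs; i++)
        fn(dev, i);
}

/* The pattern from the patch: acquire, iterate, release. */
static void toy_reset_work(struct toy_dev *dev, int n_vifs)
{
    toy_mutex_acquire(dev);
    toy_iterate_interfaces(dev, n_vifs, toy_connect_iter);
    toy_mutex_release(dev);
}
```

Calling toy_iterate_interfaces() directly, without the acquire/release pair, bumps unlocked_calls instead of mcu_cmds_sent, which is the kind of path the patch is meant to close.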
* [PATCH v4 03/21] wifi: mt76: mt7925: fix missing mutex protection in runtime PM and MLO PM
  2026-01-16 1:04 ` [PATCH v4 00/21] wifi: mt76: mt7925/mt7921: stability and MLO fixes Zac
  2026-01-16 1:04 ` [PATCH v4 01/21] wifi: mt76: mt7925: fix NULL pointer dereference in vif iteration Zac
  2026-01-16 1:05 ` [PATCH v4 02/21] wifi: mt76: mt7925: fix missing mutex protection in reset and ROC abort Zac
@ 2026-01-16 1:05 ` Zac
  2026-01-16 1:05 ` [PATCH v4 04/21] wifi: mt76: mt7925: add NULL checks in MCU STA TLV functions Zac
  ` (17 subsequent siblings)
  20 siblings, 0 replies; 113+ messages in thread
From: Zac @ 2026-01-16 1:05 UTC (permalink / raw)
  To: sean.wang
  Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless,
      lorenzo, nbd, linux, ryder.lee, sean.wang, Zac Bowling,
      Zac Bowling

From: Zac Bowling <zbowling@gmail.com>

Two additional code paths iterate over active interfaces and call MCU
functions without proper mutex protection:

1. mt7925_set_runtime_pm(): Called when runtime PM settings change.
   The callback mt7925_pm_interface_iter() calls
   mt7925_mcu_set_beacon_filter() which in turn calls
   mt7925_mcu_set_rxfilter(). These MCU functions require the device
   mutex to be held.

2. mt7925_mlo_pm_work(): A workqueue function for MLO power management.
   The callback mt7925_mlo_pm_iter() was acquiring mutex internally,
   which is inconsistent with the rest of the driver where the caller
   holds the mutex during interface iteration.

These bugs can cause deadlocks when:
- Power management settings are changed while WiFi is active
- MLO power save state transitions occur during roaming

Move the mutex to the caller in mt7925_mlo_pm_work() for consistency
with the rest of the driver, and add mutex protection in
mt7925_set_runtime_pm().

Found through static analysis (clang-tidy) and comparison with the
MT7615 driver which correctly acquires mutex before interface
iteration.

Fixes: c948b5da6bbe ("wifi: mt76: mt7925: add Mediatek Wi-Fi7 driver for mt7925 chips") Reported-by: Zac Bowling <zac@zacbowling.com> Tested-by: Zac Bowling <zac@zacbowling.com> Signed-off-by: Zac Bowling <zac@zacbowling.com> --- drivers/net/wireless/mediatek/mt76/mt7925/main.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/main.c b/drivers/net/wireless/mediatek/mt76/mt7925/main.c index 3001a62a8b..9f17b21aef 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7925/main.c +++ b/drivers/net/wireless/mediatek/mt76/mt7925/main.c @@ -751,9 +751,11 @@ void mt7925_set_runtime_pm(struct mt792x_dev *dev) bool monitor = !!(hw->conf.flags & IEEE80211_CONF_MONITOR); pm->enable = pm->enable_user && !monitor; + mt792x_mutex_acquire(dev); ieee80211_iterate_active_interfaces(hw, IEEE80211_IFACE_ITER_RESUME_ALL, mt7925_pm_interface_iter, dev); + mt792x_mutex_release(dev); pm->ds_enable = pm->ds_enable_user && !monitor; mt7925_mcu_set_deep_sleep(dev, pm->ds_enable); } @@ -1301,14 +1303,12 @@ mt7925_mlo_pm_iter(void *priv, u8 *mac, struct ieee80211_vif *vif) if (mvif->mlo_pm_state != MT792x_MLO_CHANGED_PS) return; - mt792x_mutex_acquire(dev); for_each_set_bit(i, &valid, IEEE80211_MLD_MAX_NUM_LINKS) { bss_conf = mt792x_vif_to_bss_conf(vif, i); if (!bss_conf) continue; mt7925_mcu_uni_bss_ps(dev, bss_conf); } - mt792x_mutex_release(dev); } void mt7925_mlo_pm_work(struct work_struct *work) @@ -1317,9 +1317,11 @@ void mt7925_mlo_pm_work(struct work_struct *work) mlo_pm_work.work); struct ieee80211_hw *hw = mt76_hw(dev); + mt792x_mutex_acquire(dev); ieee80211_iterate_active_interfaces(hw, IEEE80211_IFACE_ITER_RESUME_ALL, mt7925_mlo_pm_iter, dev); + mt792x_mutex_release(dev); } void mt7925_scan_work(struct work_struct *work) -- 2.52.0 ^ permalink raw reply related [flat|nested] 113+ messages in thread
* [PATCH v4 04/21] wifi: mt76: mt7925: add NULL checks in MCU STA TLV functions
  2026-01-16 1:04 ` [PATCH v4 00/21] wifi: mt76: mt7925/mt7921: stability and MLO fixes Zac
  ` (2 preceding siblings ...)
  2026-01-16 1:05 ` [PATCH v4 03/21] wifi: mt76: mt7925: fix missing mutex protection in runtime PM and MLO PM Zac
@ 2026-01-16 1:05 ` Zac
  2026-01-16 1:05 ` [PATCH v4 05/21] wifi: mt76: mt7925: add NULL checks for link_conf and mlink in main.c Zac
  ` (16 subsequent siblings)
  20 siblings, 0 replies; 113+ messages in thread
From: Zac @ 2026-01-16 1:05 UTC (permalink / raw)
  To: sean.wang
  Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless,
      lorenzo, nbd, linux, ryder.lee, sean.wang, Zac Bowling,
      Zac Bowling

From: Zac Bowling <zbowling@gmail.com>

Add NULL pointer checks for link_conf and mconf in:
- mt7925_mcu_sta_phy_tlv(): builds PHY capability TLV for station record
- mt7925_mcu_sta_rate_ctrl_tlv(): builds rate control TLV for station
  record

Both functions call mt792x_vif_to_bss_conf() and mt792x_vif_to_link()
which can return NULL during MLO link state transitions when the link
configuration in mac80211 is not yet synchronized with the driver's
link tracking.

Without these checks, the driver will crash with a NULL pointer
dereference when accessing link_conf->chanreq.oper or
link_conf->basic_rates.

Found through static analysis (clang-tidy pattern matching for
unchecked return values from functions known to return NULL).

Reported-by: Zac Bowling <zac@zacbowling.com> Fixes: c948b5da6bbe ("wifi: mt76: mt7925: add Mediatek Wi-Fi7 driver for mt7925 chips") Signed-off-by: Zac Bowling <zac@zacbowling.com> --- drivers/net/wireless/mediatek/mt76/mt7925/mcu.c | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/mcu.c b/drivers/net/wireless/mediatek/mt76/mt7925/mcu.c index cf0fdea45c..d61a7fbda7 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7925/mcu.c +++ b/drivers/net/wireless/mediatek/mt76/mt7925/mcu.c @@ -1773,6 +1773,10 @@ mt7925_mcu_sta_phy_tlv(struct sk_buff *skb, link_conf = mt792x_vif_to_bss_conf(vif, link_sta->link_id); mconf = mt792x_vif_to_link(mvif, link_sta->link_id); + + if (!link_conf || !mconf) + return; + chandef = mconf->mt76.ctx ? &mconf->mt76.ctx->def : &link_conf->chanreq.oper; @@ -1851,6 +1855,10 @@ mt7925_mcu_sta_rate_ctrl_tlv(struct sk_buff *skb, link_conf = mt792x_vif_to_bss_conf(vif, link_sta->link_id); mconf = mt792x_vif_to_link(mvif, link_sta->link_id); + + if (!link_conf || !mconf) + return; + chandef = mconf->mt76.ctx ? &mconf->mt76.ctx->def : &link_conf->chanreq.oper; band = chandef->chan->band; -- 2.52.0 ^ permalink raw reply related [flat|nested] 113+ messages in thread
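
The guard added by patch 04 (look the link up, return early if the lookup yields NULL, only then dereference) can be sketched in plain C as below. This is an illustrative model under stated assumptions: the toy_* types and helpers are hypothetical stand-ins and do not mirror the real mt792x structures or TLV layout.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_LINKS 4

struct toy_link_conf {
    int band;
};

struct toy_vif {
    /* A slot stays NULL while that link is not (or no longer) set up. */
    struct toy_link_conf *link_conf[TOY_MAX_LINKS];
};

/* Lookup that may return NULL, like the driver helpers discussed above. */
static struct toy_link_conf *toy_vif_to_link_conf(struct toy_vif *vif,
                                                  unsigned int link_id)
{
    if (link_id >= TOY_MAX_LINKS)
        return NULL;
    return vif->link_conf[link_id];
}

/*
 * Returns 1 if a value was written to *out_band, 0 if the link was
 * skipped because its configuration is missing.
 */
static int toy_build_phy_tlv(struct toy_vif *vif, unsigned int link_id,
                             int *out_band)
{
    struct toy_link_conf *conf;

    assert(vif != NULL && out_band != NULL);

    conf = toy_vif_to_link_conf(vif, link_id);
    if (!conf)
        return 0; /* skip instead of dereferencing NULL */

    *out_band = conf->band;
    return 1;
}
```

Returning a status here is a choice made for the sketch so the skip is observable; the patched driver functions are void and simply return.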
* [PATCH v4 05/21] wifi: mt76: mt7925: add NULL checks for link_conf and mlink in main.c
  2026-01-16 1:04 ` [PATCH v4 00/21] wifi: mt76: mt7925/mt7921: stability and MLO fixes Zac
  ` (3 preceding siblings ...)
  2026-01-16 1:05 ` [PATCH v4 04/21] wifi: mt76: mt7925: add NULL checks in MCU STA TLV functions Zac
@ 2026-01-16 1:05 ` Zac
  2026-01-16 1:05 ` [PATCH v4 06/21] wifi: mt76: mt7925: add error handling for AMPDU MCU commands Zac
  ` (15 subsequent siblings)
  20 siblings, 0 replies; 113+ messages in thread
From: Zac @ 2026-01-16 1:05 UTC (permalink / raw)
  To: sean.wang
  Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless,
      lorenzo, nbd, linux, ryder.lee, sean.wang, Zac Bowling,
      Zac Bowling

From: Zac Bowling <zbowling@gmail.com>

Add NULL pointer checks throughout main.c for functions that call
mt792x_vif_to_bss_conf(), mt792x_vif_to_link(), and
mt792x_sta_to_link() without verifying the return value before
dereferencing.

Functions fixed:
- mt7925_set_key(): Check link_conf, mconf, and mlink before use
- mt7925_mac_link_sta_add(): Check link_conf before BSS info update
- mt7925_mac_link_sta_assoc(): Check mlink and link_conf before use
- mt7925_mac_link_sta_remove(): Check mlink and link_conf, add goto
  label for proper cleanup path
- mt7925_change_vif_links(): Check link_conf before adding BSS

These functions can receive NULL when the link configuration in
mac80211 is not yet synchronized with the driver's link tracking during
MLO operations or state transitions.

Without these checks, the driver crashes during station
add/remove/association operations with NULL pointer dereference:

  BUG: kernel NULL pointer dereference, address: 0000000000000010
  Call Trace:
   mt7925_mac_link_sta_add+0x...
   ...

Found through static analysis and triggered during BSSID roaming on
systems with multiple access points.

Reported-by: Zac Bowling <zac@zacbowling.com> Fixes: c948b5da6bbe ("wifi: mt76: mt7925: add Mediatek Wi-Fi7 driver for mt7925 chips") Signed-off-by: Zac Bowling <zac@zacbowling.com> --- .../net/wireless/mediatek/mt76/mt7925/main.c | 27 ++++++++++++++++--- 1 file changed, 23 insertions(+), 4 deletions(-) diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/main.c b/drivers/net/wireless/mediatek/mt76/mt7925/main.c index 9f17b21aef..7d3322461b 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7925/main.c +++ b/drivers/net/wireless/mediatek/mt76/mt7925/main.c @@ -604,6 +604,10 @@ static int mt7925_set_link_key(struct ieee80211_hw *hw, enum set_key_cmd cmd, link_sta = sta ? mt792x_sta_to_link_sta(vif, sta, link_id) : NULL; mconf = mt792x_vif_to_link(mvif, link_id); mlink = mt792x_sta_to_link(msta, link_id); + + if (!link_conf || !mconf || !mlink) + return -EINVAL; + wcid = &mlink->wcid; wcid_keyidx = &wcid->hw_key_idx; @@ -889,6 +893,8 @@ static int mt7925_mac_link_sta_add(struct mt76_dev *mdev, MT_WTBL_UPDATE_ADM_COUNT_CLEAR); link_conf = mt792x_vif_to_bss_conf(vif, link_id); + if (!link_conf) + return -EINVAL; /* should update bss info before STA add */ if (vif->type == NL80211_IFTYPE_STATION && !link_sta->sta->tdls) { @@ -1034,6 +1040,8 @@ static void mt7925_mac_link_sta_assoc(struct mt76_dev *mdev, msta = (struct mt792x_sta *)link_sta->sta->drv_priv; mlink = mt792x_sta_to_link(msta, link_sta->link_id); + if (!mlink) + return; mt792x_mutex_acquire(dev); @@ -1043,12 +1051,13 @@ static void mt7925_mac_link_sta_assoc(struct mt76_dev *mdev, link_conf = mt792x_vif_to_bss_conf(vif, vif->bss_conf.link_id); } - if (vif->type == NL80211_IFTYPE_STATION && !link_sta->sta->tdls) { + if (link_conf && vif->type == NL80211_IFTYPE_STATION && !link_sta->sta->tdls) { struct mt792x_bss_conf *mconf; mconf = mt792x_link_conf_to_mconf(link_conf); - mt7925_mcu_add_bss_info(&dev->phy, mconf->mt76.ctx, - link_conf, link_sta, true); + if (mconf) + mt7925_mcu_add_bss_info(&dev->phy, 
mconf->mt76.ctx, + link_conf, link_sta, true); } ewma_avg_signal_init(&mlink->avg_ack_signal); @@ -1095,6 +1104,8 @@ static void mt7925_mac_link_sta_remove(struct mt76_dev *mdev, msta = (struct mt792x_sta *)link_sta->sta->drv_priv; mlink = mt792x_sta_to_link(msta, link_id); + if (!mlink) + return; mt7925_roc_abort_sync(dev); @@ -1108,10 +1119,12 @@ static void mt7925_mac_link_sta_remove(struct mt76_dev *mdev, link_conf = mt792x_vif_to_bss_conf(vif, link_id); - if (vif->type == NL80211_IFTYPE_STATION && !link_sta->sta->tdls) { + if (link_conf && vif->type == NL80211_IFTYPE_STATION && !link_sta->sta->tdls) { struct mt792x_bss_conf *mconf; mconf = mt792x_link_conf_to_mconf(link_conf); + if (!mconf) + goto out; if (ieee80211_vif_is_mld(vif)) mt792x_mac_link_bss_remove(dev, mconf, mlink); @@ -1119,6 +1132,7 @@ static void mt7925_mac_link_sta_remove(struct mt76_dev *mdev, mt7925_mcu_add_bss_info(&dev->phy, mconf->mt76.ctx, link_conf, link_sta, false); } +out: spin_lock_bh(&mdev->sta_poll_lock); if (!list_empty(&mlink->wcid.poll_list)) @@ -2031,6 +2045,11 @@ mt7925_change_vif_links(struct ieee80211_hw *hw, struct ieee80211_vif *vif, mlink = mlinks[link_id]; link_conf = mt792x_vif_to_bss_conf(vif, link_id); + if (!link_conf) { + err = -EINVAL; + goto free; + } + rcu_assign_pointer(mvif->link_conf[link_id], mconf); rcu_assign_pointer(mvif->sta.link[link_id], mlink); -- 2.52.0 ^ permalink raw reply related [flat|nested] 113+ messages in thread
* [PATCH v4 06/21] wifi: mt76: mt7925: add error handling for AMPDU MCU commands
  2026-01-16 1:04 ` [PATCH v4 00/21] wifi: mt76: mt7925/mt7921: stability and MLO fixes Zac
  ` (4 preceding siblings ...)
  2026-01-16 1:05 ` [PATCH v4 05/21] wifi: mt76: mt7925: add NULL checks for link_conf and mlink in main.c Zac
@ 2026-01-16 1:05 ` Zac
  2026-01-16 1:05 ` [PATCH v4 07/21] wifi: mt76: mt7925: add error handling for BSS info MCU command in sta_add Zac
  ` (14 subsequent siblings)
  20 siblings, 0 replies; 113+ messages in thread
From: Zac @ 2026-01-16 1:05 UTC (permalink / raw)
  To: sean.wang
  Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless,
      lorenzo, nbd, linux, ryder.lee, sean.wang, Zac Bowling,
      Zac Bowling

From: Zac Bowling <zbowling@gmail.com>

Check return values of mt7925_mcu_uni_rx_ba() and
mt7925_mcu_uni_tx_ba() in mt7925_ampdu_action() and propagate errors
to the caller.

Previously, failures in these MCU commands were silently ignored, which
could leave block aggregation in an inconsistent state between the
driver and firmware.

For IEEE80211_AMPDU_TX_STOP_CONT, only call the completion callback
ieee80211_stop_tx_ba_cb_irqsafe() if the MCU command succeeded, to
avoid signaling completion when the firmware operation failed.

Found through code review - pattern of ignored return values throughout
AMPDU handling.

Reported-by: Zac Bowling <zac@zacbowling.com> Fixes: c948b5da6bbe ("wifi: mt76: mt7925: add Mediatek Wi-Fi7 driver for mt7925 chips") Signed-off-by: Zac Bowling <zac@zacbowling.com> --- drivers/net/wireless/mediatek/mt76/mt7925/main.c | 13 +++++++------ 1 file changed, 7 insertions(+), 6 deletions(-) diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/main.c b/drivers/net/wireless/mediatek/mt76/mt7925/main.c index 7d3322461b..d966e5ab50 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7925/main.c +++ b/drivers/net/wireless/mediatek/mt76/mt7925/main.c @@ -1271,22 +1271,22 @@ mt7925_ampdu_action(struct ieee80211_hw *hw, struct ieee80211_vif *vif, case IEEE80211_AMPDU_RX_START: mt76_rx_aggr_start(&dev->mt76, &msta->deflink.wcid, tid, ssn, params->buf_size); - mt7925_mcu_uni_rx_ba(dev, params, true); + ret = mt7925_mcu_uni_rx_ba(dev, params, true); break; case IEEE80211_AMPDU_RX_STOP: mt76_rx_aggr_stop(&dev->mt76, &msta->deflink.wcid, tid); - mt7925_mcu_uni_rx_ba(dev, params, false); + ret = mt7925_mcu_uni_rx_ba(dev, params, false); break; case IEEE80211_AMPDU_TX_OPERATIONAL: mtxq->aggr = true; mtxq->send_bar = false; - mt7925_mcu_uni_tx_ba(dev, params, true); + ret = mt7925_mcu_uni_tx_ba(dev, params, true); break; case IEEE80211_AMPDU_TX_STOP_FLUSH: case IEEE80211_AMPDU_TX_STOP_FLUSH_CONT: mtxq->aggr = false; clear_bit(tid, &msta->deflink.wcid.ampdu_state); - mt7925_mcu_uni_tx_ba(dev, params, false); + ret = mt7925_mcu_uni_tx_ba(dev, params, false); break; case IEEE80211_AMPDU_TX_START: set_bit(tid, &msta->deflink.wcid.ampdu_state); @@ -1295,8 +1295,9 @@ mt7925_ampdu_action(struct ieee80211_hw *hw, struct ieee80211_vif *vif, case IEEE80211_AMPDU_TX_STOP_CONT: mtxq->aggr = false; clear_bit(tid, &msta->deflink.wcid.ampdu_state); - mt7925_mcu_uni_tx_ba(dev, params, false); - ieee80211_stop_tx_ba_cb_irqsafe(vif, sta->addr, tid); + ret = mt7925_mcu_uni_tx_ba(dev, params, false); + if (!ret) + ieee80211_stop_tx_ba_cb_irqsafe(vif, sta->addr, tid); break; } 
mt792x_mutex_release(dev); -- 2.52.0 ^ permalink raw reply related [flat|nested] 113+ messages in thread
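
As an illustration of the error-propagation change in patch 06, the sketch below models the TX_STOP_CONT case: the firmware command's return value is kept, and the completion callback is signalled only when the command succeeded. The toy_* names and the fw_result field are hypothetical, introduced only so the failure path can be exercised outside the kernel.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_ba_state {
    bool aggr;          /* driver-side aggregation flag */
    int stop_cb_calls;  /* how often completion was signalled */
    int fw_result;      /* what the simulated firmware command returns */
};

/* Stand-in for an MCU block-ack command; returns 0 or a negative errno. */
static int toy_mcu_tx_ba(struct toy_ba_state *st, bool enable)
{
    (void)enable;
    return st->fw_result;
}

/* Stand-in for the "stop TX BA session done" completion callback. */
static void toy_stop_tx_ba_cb(struct toy_ba_state *st)
{
    st->stop_cb_calls++;
}

/* Models the STOP_CONT branch: propagate errors, complete only on success. */
static int toy_ampdu_tx_stop_cont(struct toy_ba_state *st)
{
    int ret;

    assert(st != NULL);

    st->aggr = false;
    ret = toy_mcu_tx_ba(st, false);
    if (!ret)
        toy_stop_tx_ba_cb(st);

    return ret;
}
```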
* [PATCH v4 07/21] wifi: mt76: mt7925: add error handling for BSS info MCU command in sta_add
  2026-01-16 1:04 ` [PATCH v4 00/21] wifi: mt76: mt7925/mt7921: stability and MLO fixes Zac
  ` (5 preceding siblings ...)
  2026-01-16 1:05 ` [PATCH v4 06/21] wifi: mt76: mt7925: add error handling for AMPDU MCU commands Zac
@ 2026-01-16 1:05 ` Zac
  2026-01-16 1:05 ` [PATCH v4 08/21] wifi: mt76: mt7925: add error handling for BSS info in key setup Zac
  ` (13 subsequent siblings)
  20 siblings, 0 replies; 113+ messages in thread
From: Zac @ 2026-01-16 1:05 UTC (permalink / raw)
  To: sean.wang
  Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless,
      lorenzo, nbd, linux, ryder.lee, sean.wang, Zac Bowling,
      Zac Bowling

From: Zac Bowling <zbowling@gmail.com>

Check return value of mt7925_mcu_add_bss_info() in
mt7925_mac_link_sta_add() and propagate errors to the caller.

BSS info must be set up before adding a station record. If this MCU
command fails, continuing with station add would leave the firmware in
an inconsistent state with a station but no BSS configuration.

This can cause undefined behavior in the firmware, particularly during
MLO link setup where multiple BSS configurations are being programmed.

Reported-by: Zac Bowling <zac@zacbowling.com> Fixes: c948b5da6bbe ("wifi: mt76: mt7925: add Mediatek Wi-Fi7 driver for mt7925 chips") Signed-off-by: Zac Bowling <zac@zacbowling.com> --- drivers/net/wireless/mediatek/mt76/mt7925/main.c | 11 +++++++---- 1 file changed, 7 insertions(+), 4 deletions(-) diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/main.c b/drivers/net/wireless/mediatek/mt76/mt7925/main.c index d966e5ab50..a7e1e673c4 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7925/main.c +++ b/drivers/net/wireless/mediatek/mt76/mt7925/main.c @@ -899,11 +899,14 @@ static int mt7925_mac_link_sta_add(struct mt76_dev *mdev, /* should update bss info before STA add */ if (vif->type == NL80211_IFTYPE_STATION && !link_sta->sta->tdls) { if (ieee80211_vif_is_mld(vif)) - mt7925_mcu_add_bss_info(&dev->phy, mconf->mt76.ctx, - link_conf, link_sta, link_sta != mlink->pri_link); + ret = mt7925_mcu_add_bss_info(&dev->phy, mconf->mt76.ctx, + link_conf, link_sta, + link_sta != mlink->pri_link); else - mt7925_mcu_add_bss_info(&dev->phy, mconf->mt76.ctx, - link_conf, link_sta, false); + ret = mt7925_mcu_add_bss_info(&dev->phy, mconf->mt76.ctx, + link_conf, link_sta, false); + if (ret) + return ret; } if (ieee80211_vif_is_mld(vif) && -- 2.52.0 ^ permalink raw reply related [flat|nested] 113+ messages in thread
* [PATCH v4 08/21] wifi: mt76: mt7925: add error handling for BSS info in key setup
  2026-01-16 1:04 ` [PATCH v4 00/21] wifi: mt76: mt7925/mt7921: stability and MLO fixes Zac
  ` (6 preceding siblings ...)
  2026-01-16 1:05 ` [PATCH v4 07/21] wifi: mt76: mt7925: add error handling for BSS info MCU command in sta_add Zac
@ 2026-01-16 1:05 ` Zac
  2026-01-16 1:05 ` [PATCH v4 09/21] wifi: mt76: mt7925: add NULL checks in MLO link and chanctx functions Zac
  ` (12 subsequent siblings)
  20 siblings, 0 replies; 113+ messages in thread
From: Zac @ 2026-01-16 1:05 UTC (permalink / raw)
  To: sean.wang
  Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless,
      lorenzo, nbd, linux, ryder.lee, sean.wang, Zac Bowling,
      Zac Bowling

From: Zac Bowling <zbowling@gmail.com>

Check return value of mt7925_mcu_add_bss_info() in
mt7925_set_link_key() when setting up cipher for the first time and
propagate errors.

The BSS info update with cipher information must succeed before key
programming can proceed. If this MCU command fails, continuing with key
setup would program keys into the firmware for a BSS that does not have
the correct cipher configuration.

SECURITY NOTE: Silent failure here is particularly dangerous because
the user would believe encryption is active when the firmware may not
have the cipher properly configured, potentially resulting in
unencrypted or incorrectly encrypted traffic.

This ensures the error is propagated up the stack rather than silently
ignored.

Reported-by: Zac Bowling <zac@zacbowling.com> Fixes: c948b5da6bbe ("wifi: mt76: mt7925: add Mediatek Wi-Fi7 driver for mt7925 chips") Signed-off-by: Zac Bowling <zac@zacbowling.com> --- drivers/net/wireless/mediatek/mt76/mt7925/main.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/main.c b/drivers/net/wireless/mediatek/mt76/mt7925/main.c index a7e1e673c4..058394b2e0 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7925/main.c +++ b/drivers/net/wireless/mediatek/mt76/mt7925/main.c @@ -637,8 +637,10 @@ static int mt7925_set_link_key(struct ieee80211_hw *hw, enum set_key_cmd cmd, struct mt792x_phy *phy = mt792x_hw_phy(hw); mconf->mt76.cipher = mt7925_mcu_get_cipher(key->cipher); - mt7925_mcu_add_bss_info(phy, mconf->mt76.ctx, link_conf, - link_sta, true); + err = mt7925_mcu_add_bss_info(phy, mconf->mt76.ctx, link_conf, + link_sta, true); + if (err) + goto out; } if (cmd == SET_KEY) -- 2.52.0 ^ permalink raw reply related [flat|nested] 113+ messages in thread
* [PATCH v4 09/21] wifi: mt76: mt7925: add NULL checks in MLO link and chanctx functions
  2026-01-16 1:04 ` [PATCH v4 00/21] wifi: mt76: mt7925/mt7921: stability and MLO fixes Zac
  ` (7 preceding siblings ...)
  2026-01-16 1:05 ` [PATCH v4 08/21] wifi: mt76: mt7925: add error handling for BSS info in key setup Zac
@ 2026-01-16 1:05 ` Zac
  2026-01-16 1:05 ` [PATCH v4 10/21] wifi: mt76: mt792x: fix NULL pointer dereference in TX path Zac
  ` (11 subsequent siblings)
  20 siblings, 0 replies; 113+ messages in thread
From: Zac @ 2026-01-16 1:05 UTC (permalink / raw)
  To: sean.wang
  Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless,
      lorenzo, nbd, linux, ryder.lee, sean.wang, Zac Bowling,
      Zac Bowling

From: Zac Bowling <zbowling@gmail.com>

Add NULL pointer checks for mconf and link_conf in several functions
that were missing validation after calling mt792x_vif_to_link() and
mt792x_vif_to_bss_conf().

Functions fixed:
- mt7925_mac_set_links(): Check both primary and secondary link_conf
  before dereferencing chanreq.oper for band selection
- mt7925_link_info_changed(): Check mconf before using it to get
  link_conf, prevents NULL dereference chain
- mt7925_assign_vif_chanctx(): Check mconf before use, return -EINVAL
  if NULL; check pri_link_conf before passing to MCU function
- mt7925_unassign_vif_chanctx(): Check mconf before dereferencing,
  return early if NULL during MLO cleanup

These functions handle MLO (Multi-Link Operation) scenarios where link
configurations may not be fully set up when called, particularly during
rapid link state transitions or error recovery paths.

Prevents panics during WiFi 7 MLO link setup and teardown sequences.

Reported-by: Zac Bowling <zac@zacbowling.com> Fixes: c948b5da6bbe ("wifi: mt76: mt7925: add Mediatek Wi-Fi7 driver for mt7925 chips") Signed-off-by: Zac Bowling <zac@zacbowling.com> --- .../net/wireless/mediatek/mt76/mt7925/main.c | 39 +++++++++++++++---- 1 file changed, 32 insertions(+), 7 deletions(-) diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/main.c b/drivers/net/wireless/mediatek/mt76/mt7925/main.c index 058394b2e0..852cf8ff84 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7925/main.c +++ b/drivers/net/wireless/mediatek/mt76/mt7925/main.c @@ -1006,18 +1006,29 @@ mt7925_mac_set_links(struct mt76_dev *mdev, struct ieee80211_vif *vif) { struct mt792x_dev *dev = container_of(mdev, struct mt792x_dev, mt76); struct mt792x_vif *mvif = (struct mt792x_vif *)vif->drv_priv; - struct ieee80211_bss_conf *link_conf = - mt792x_vif_to_bss_conf(vif, mvif->deflink_id); - struct cfg80211_chan_def *chandef = &link_conf->chanreq.oper; - enum nl80211_band band = chandef->chan->band, secondary_band; + struct ieee80211_bss_conf *link_conf; + struct cfg80211_chan_def *chandef; + enum nl80211_band band, secondary_band; + u16 sel_links; + u8 secondary_link_id; - u16 sel_links = mt76_select_links(vif, 2); - u8 secondary_link_id = __ffs(~BIT(mvif->deflink_id) & sel_links); + link_conf = mt792x_vif_to_bss_conf(vif, mvif->deflink_id); + if (!link_conf) + return; + + chandef = &link_conf->chanreq.oper; + band = chandef->chan->band; + + sel_links = mt76_select_links(vif, 2); + secondary_link_id = __ffs(~BIT(mvif->deflink_id) & sel_links); if (!ieee80211_vif_is_mld(vif) || hweight16(sel_links) < 2) return; link_conf = mt792x_vif_to_bss_conf(vif, secondary_link_id); + if (!link_conf) + return; + secondary_band = link_conf->chanreq.oper.chan->band; if (band == NL80211_BAND_2GHZ || @@ -1927,7 +1938,12 @@ static void mt7925_link_info_changed(struct ieee80211_hw *hw, struct ieee80211_bss_conf *link_conf; mconf = mt792x_vif_to_link(mvif, info->link_id); + if (!mconf) + return; + 
link_conf = mt792x_vif_to_bss_conf(vif, mconf->link_id); + if (!link_conf) + return; mt792x_mutex_acquire(dev); @@ -2136,9 +2152,14 @@ static int mt7925_assign_vif_chanctx(struct ieee80211_hw *hw, if (ieee80211_vif_is_mld(vif)) { mconf = mt792x_vif_to_link(mvif, link_conf->link_id); + if (!mconf) { + mutex_unlock(&dev->mt76.mutex); + return -EINVAL; + } + pri_link_conf = mt792x_vif_to_bss_conf(vif, mvif->deflink_id); - if (vif->type == NL80211_IFTYPE_STATION && + if (pri_link_conf && vif->type == NL80211_IFTYPE_STATION && mconf == &mvif->bss_conf) mt7925_mcu_add_bss_info(&dev->phy, NULL, pri_link_conf, NULL, true); @@ -2167,6 +2188,10 @@ static void mt7925_unassign_vif_chanctx(struct ieee80211_hw *hw, if (ieee80211_vif_is_mld(vif)) { mconf = mt792x_vif_to_link(mvif, link_conf->link_id); + if (!mconf) { + mutex_unlock(&dev->mt76.mutex); + return; + } if (vif->type == NL80211_IFTYPE_STATION && mconf == &mvif->bss_conf) -- 2.52.0 ^ permalink raw reply related [flat|nested] 113+ messages in thread
* [PATCH v4 10/21] wifi: mt76: mt792x: fix NULL pointer dereference in TX path
  2026-01-16 1:04 ` [PATCH v4 00/21] wifi: mt76: mt7925/mt7921: stability and MLO fixes Zac
  ` (8 preceding siblings ...)
  2026-01-16 1:05 ` [PATCH v4 09/21] wifi: mt76: mt7925: add NULL checks in MLO link and chanctx functions Zac
@ 2026-01-16 1:05 ` Zac
  2026-01-16 1:05 ` [PATCH v4 11/21] wifi: mt76: mt7925: add lockdep assertions for mutex verification Zac
  ` (10 subsequent siblings)
  20 siblings, 0 replies; 113+ messages in thread
From: Zac @ 2026-01-16 1:05 UTC (permalink / raw)
  To: sean.wang
  Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless,
      lorenzo, nbd, linux, ryder.lee, sean.wang, Zac Bowling,
      Zac Bowling

From: Zac Bowling <zbowling@gmail.com>

Add NULL pointer checks in mt792x_tx() to prevent kernel crashes when
transmitting packets during MLO link removal.

The function calls mt792x_sta_to_link() which can return NULL if the
link is being removed, but the return value was dereferenced without
checking. Similarly, the RCU-protected link_conf and link_sta pointers
were used without NULL validation.

This race can occur when:
1. A packet is queued for transmission
2. Concurrently, the link is being removed (mt7925_mac_link_sta_remove)
3. mt792x_sta_to_link() returns NULL for the removed link
4. Kernel crashes on wcid = &mlink->wcid dereference

Example crash trace:

  BUG: kernel NULL pointer dereference
  RIP: mt792x_tx+0x...
  Call Trace:
   ieee80211_tx+0x...
   __ieee80211_subif_start_xmit+0x...

Fix by:
- Check mlink return value before dereferencing wcid
- Check RCU-dereferenced conf and link_sta before use
- Free the SKB and return early if any pointer is NULL

This affects both MT7921 and MT7925 drivers as mt792x_core.c is shared.

Fixes: c74df1c067f2 ("wifi: mt76: mt792x: introduce mt792x-lib module") Reported-by: Zac Bowling <zac@zacbowling.com> Signed-off-by: Zac Bowling <zac@zacbowling.com> --- drivers/net/wireless/mediatek/mt76/mt792x_core.c | 13 +++++++++++-- 1 file changed, 11 insertions(+), 2 deletions(-) diff --git a/drivers/net/wireless/mediatek/mt76/mt792x_core.c b/drivers/net/wireless/mediatek/mt76/mt792x_core.c index f2ed16feb6..9dc768aa8b 100644 --- a/drivers/net/wireless/mediatek/mt76/mt792x_core.c +++ b/drivers/net/wireless/mediatek/mt76/mt792x_core.c @@ -95,6 +95,8 @@ void mt792x_tx(struct ieee80211_hw *hw, struct ieee80211_tx_control *control, IEEE80211_TX_CTRL_MLO_LINK); sta = (struct mt792x_sta *)control->sta->drv_priv; mlink = mt792x_sta_to_link(sta, link_id); + if (!mlink) + goto free_skb; wcid = &mlink->wcid; } @@ -113,9 +115,12 @@ void mt792x_tx(struct ieee80211_hw *hw, struct ieee80211_tx_control *control, link_id = wcid->link_id; rcu_read_lock(); conf = rcu_dereference(vif->link_conf[link_id]); - memcpy(hdr->addr2, conf->addr, ETH_ALEN); - link_sta = rcu_dereference(control->sta->link[link_id]); + if (!conf || !link_sta) { + rcu_read_unlock(); + goto free_skb; + } + memcpy(hdr->addr2, conf->addr, ETH_ALEN); memcpy(hdr->addr1, link_sta->addr, ETH_ALEN); if (vif->type == NL80211_IFTYPE_STATION) @@ -136,6 +141,10 @@ void mt792x_tx(struct ieee80211_hw *hw, struct ieee80211_tx_control *control, } mt76_connac_pm_queue_skb(hw, &dev->pm, wcid, skb); + return; + +free_skb: + ieee80211_free_txskb(hw, skb); } EXPORT_SYMBOL_GPL(mt792x_tx); -- 2.52.0 ^ permalink raw reply related [flat|nested] 113+ messages in thread
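
The control flow that patch 10 introduces in the TX path (bail out to a single free label when the per-link state is missing, otherwise queue the frame) is sketched below in userspace C. The toy_* structures are hypothetical and far simpler than sk_buff or mt792x_link_sta; the sketch does not model RCU or the actual race, only the "drop instead of dereference" decision.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_LINKS 4

struct toy_link {
    int wcid_idx;
};

struct toy_sta {
    /* A slot becomes NULL once the corresponding link is removed. */
    struct toy_link *link[TOY_MAX_LINKS];
};

struct toy_skb {
    bool queued;
    bool freed;
    int wcid_idx;
};

/* Queue the frame if the link still exists, otherwise free (drop) it. */
static void toy_tx(struct toy_sta *sta, unsigned int link_id,
                   struct toy_skb *skb)
{
    struct toy_link *mlink;

    assert(sta != NULL && skb != NULL);

    mlink = link_id < TOY_MAX_LINKS ? sta->link[link_id] : NULL;
    if (!mlink)
        goto free_skb;

    skb->wcid_idx = mlink->wcid_idx;
    skb->queued = true;
    return;

free_skb:
    /* Single exit for every "pointer went away" case. */
    skb->freed = true;
}
```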
* [PATCH v4 11/21] wifi: mt76: mt7925: add lockdep assertions for mutex verification
  2026-01-16 1:04 ` [PATCH v4 00/21] wifi: mt76: mt7925/mt7921: stability and MLO fixes Zac
  ` (9 preceding siblings ...)
  2026-01-16 1:05 ` [PATCH v4 10/21] wifi: mt76: mt792x: fix NULL pointer dereference in TX path Zac
@ 2026-01-16 1:05 ` Zac
  2026-01-16 1:05 ` [PATCH v4 12/21] wifi: mt76: mt7925: fix key removal failure during MLO roaming Zac
  ` (9 subsequent siblings)
  20 siblings, 0 replies; 113+ messages in thread
From: Zac @ 2026-01-16 1:05 UTC (permalink / raw)
  To: sean.wang
  Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless,
      lorenzo, nbd, linux, ryder.lee, sean.wang, Zac Bowling,
      Zac Bowling

From: Zac Bowling <zbowling@gmail.com>

Add lockdep_assert_held() calls to critical MCU functions to help catch
mutex violations during development and debugging. This follows the
pattern used in other mt76 drivers (mt7996, mt7915, mt7615).

Functions with new assertions:
- mt7925_mcu_add_bss_info(): Core BSS configuration MCU command
- mt7925_mcu_sta_update(): Station record update MCU command
- mt7925_mcu_uni_bss_ps(): Power save state MCU command

These functions modify firmware state and must be called with the
device mutex held to prevent race conditions. The lockdep assertions
will trigger warnings at runtime if code paths exist that call these
functions without proper mutex protection.

This aids in detecting the class of bugs fixed by patches in this
series.

Signed-off-by: Zac Bowling <zac@zacbowling.com> --- drivers/net/wireless/mediatek/mt76/mt7925/mcu.c | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/mcu.c b/drivers/net/wireless/mediatek/mt76/mt7925/mcu.c index d61a7fbda7..958ff9da9f 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7925/mcu.c +++ b/drivers/net/wireless/mediatek/mt76/mt7925/mcu.c @@ -1527,6 +1527,8 @@ int mt7925_mcu_uni_bss_ps(struct mt792x_dev *dev, }, }; + lockdep_assert_held(&dev->mt76.mutex); + if (link_conf->vif->type != NL80211_IFTYPE_STATION) return -EOPNOTSUPP; @@ -2037,6 +2039,8 @@ int mt7925_mcu_sta_update(struct mt792x_dev *dev, struct mt792x_sta *msta; struct mt792x_link_sta *mlink; + lockdep_assert_held(&dev->mt76.mutex); + if (link_sta) { msta = (struct mt792x_sta *)link_sta->sta->drv_priv; mlink = mt792x_sta_to_link(msta, link_sta->link_id); @@ -2843,6 +2847,8 @@ int mt7925_mcu_add_bss_info(struct mt792x_phy *phy, struct mt792x_link_sta *mlink_bc; struct sk_buff *skb; + lockdep_assert_held(&dev->mt76.mutex); + skb = __mt7925_mcu_alloc_bss_req(&dev->mt76, &mconf->mt76, MT7925_BSS_UPDATE_MAX_SIZE); if (IS_ERR(skb)) -- 2.52.0 ^ permalink raw reply related [flat|nested] 113+ messages in thread
* [PATCH v4 12/21] wifi: mt76: mt7925: fix key removal failure during MLO roaming
  2026-01-16 1:04 ` [PATCH v4 00/21] wifi: mt76: mt7925/mt7921: stability and MLO fixes Zac
  ` (10 preceding siblings ...)
  2026-01-16 1:05 ` [PATCH v4 11/21] wifi: mt76: mt7925: add lockdep assertions for mutex verification Zac
@ 2026-01-16 1:05 ` Zac
  2026-01-16 1:05 ` [PATCH v4 13/21] wifi: mt76: mt7925: fix kernel warning in MLO ROC setup Zac
  ` (8 subsequent siblings)
  20 siblings, 0 replies; 113+ messages in thread
From: Zac @ 2026-01-16 1:05 UTC (permalink / raw)
  To: sean.wang
  Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless,
      lorenzo, nbd, linux, ryder.lee, sean.wang, Zac Bowling,
      Zac Bowling

From: Zac Bowling <zbowling@gmail.com>

During MLO roaming, mac80211 may request key removal after the link
state has already been torn down. The current code returns -EINVAL when
link_conf, mconf, or mlink is NULL, causing 'failed to remove key from
hardware (-22)' errors in the kernel log.

This is a race condition where:
1. MLO link teardown begins, cleaning up driver state
2. mac80211 requests group key removal for the old link
3. mt792x_vif_to_bss_conf() or related functions return NULL
4. Driver returns -EINVAL, confusing upper layers

Observed kernel log errors during roaming:

  wlp192s0: failed to remove key (1, ff:ff:ff:ff:ff:ff) from hardware (-22)
  wlp192s0: failed to remove key (4, ff:ff:ff:ff:ff:ff) from hardware (-22)

And associated wpa_supplicant warnings:

  nl80211: kernel reports: link ID must for MLO group key

The fix: When removing a key (cmd != SET_KEY), if the link state is
already gone, return success (0) instead of error. The key is
effectively removed when the link was torn down.

Fixes: c948b5da6bbe ("wifi: mt76: mt7925: add Mediatek Wi-Fi7 driver for mt7925 chips") Reported-by: Zac Bowling <zac@zacbowling.com> Tested-by: Zac Bowling <zac@zacbowling.com> Signed-off-by: Zac Bowling <zac@zacbowling.com> --- drivers/net/wireless/mediatek/mt76/mt7925/main.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/main.c b/drivers/net/wireless/mediatek/mt76/mt7925/main.c index 852cf8ff84..7cf6faa1f6 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7925/main.c +++ b/drivers/net/wireless/mediatek/mt76/mt7925/main.c @@ -605,8 +605,15 @@ static int mt7925_set_link_key(struct ieee80211_hw *hw, enum set_key_cmd cmd, mconf = mt792x_vif_to_link(mvif, link_id); mlink = mt792x_sta_to_link(msta, link_id); - if (!link_conf || !mconf || !mlink) + if (!link_conf || !mconf || !mlink) { + /* During MLO roaming, link state may be torn down before + * mac80211 requests key removal. If removing a key and + * the link is already gone, consider it successfully removed. + */ + if (cmd != SET_KEY) + return 0; return -EINVAL; + } wcid = &mlink->wcid; wcid_keyidx = &wcid->hw_key_idx; -- 2.52.0 ^ permalink raw reply related [flat|nested] 113+ messages in thread
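The early-exit rule in this patch can be sketched outside the driver as a small userspace model. Everything below (enum key_cmd, struct link_state, set_link_key) is a hypothetical stand-in rather than the mt7925 code; it only illustrates treating removal as idempotent when the link state is already gone.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver's key commands and per-link state. */
enum key_cmd { KEY_SET, KEY_DISABLE };

struct link_state {
	int hw_key_idx;		/* -1 means no key installed */
};

/*
 * A missing link is an error when installing a key, but counts as
 * success when removing one: tearing the link down already dropped
 * whatever keys it had, so there is nothing left to undo.
 */
int set_link_key(enum key_cmd cmd, struct link_state *link, int keyidx)
{
	if (!link)
		return cmd == KEY_SET ? -EINVAL : 0;

	link->hw_key_idx = cmd == KEY_SET ? keyidx : -1;
	return 0;
}
```

With this shape the caller never sees -EINVAL for a removal that has nothing left to remove, which is the case the log messages above are assumed to come from.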
* [PATCH v4 13/21] wifi: mt76: mt7925: fix kernel warning in MLO ROC setup 2026-01-16 1:04 ` [PATCH v4 00/21] wifi: mt76: mt7925/mt7921: stability and MLO fixes Zac ` (11 preceding siblings ...) 2026-01-16 1:05 ` [PATCH v4 12/21] wifi: mt76: mt7925: fix key removal failure during MLO roaming Zac @ 2026-01-16 1:05 ` Zac 2026-01-16 1:05 ` [PATCH v4 14/21] wifi: mt76: mt7925: add NULL checks for MLO link pointers in MCU functions Zac ` (7 subsequent siblings) 20 siblings, 0 replies; 113+ messages in thread From: Zac @ 2026-01-16 1:05 UTC (permalink / raw) To: sean.wang Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless, lorenzo, nbd, linux, ryder.lee, sean.wang, Zac Bowling, Zac Bowling From: Zac Bowling <zbowling@gmail.com> mt7925_mcu_set_mlo_roc() uses WARN_ON_ONCE() to check if link_conf or channel is NULL. However, during MLO AP setup, it's normal for the channel to not be configured yet when this function is called. The WARN_ON_ONCE triggers a kernel warning/oops that makes the system appear to have crashed, even though it's just a timing issue. Example kernel oops during AP setup: WARNING: CPU: 0 PID: 12345 at drivers/net/wireless/mediatek/mt76/mt7925/mcu.c:1345 Call Trace: mt7925_mcu_set_mlo_roc+0x... mt7925_remain_on_channel+0x... Replace WARN_ON_ONCE with regular NULL checks and return -ENOLINK to indicate the link is not fully configured yet. This allows the upper layers to retry when the link is ready, without spamming the kernel log with warnings. Also add a check for mconf in the first loop to match the pattern used in the second loop, preventing potential NULL dereference. This fixes kernel oops reported during MLO AP setup on OpenWrt with MT7925E hardware and similar issues on standard Linux distributions. 
Fixes: c5d11e4a9fa8 ("wifi: mt76: mt7925: add mt7925_change_vif_links") Link: https://github.com/openwrt/mt76/issues/1014 Signed-off-by: Zac Bowling <zac@zacbowling.com> --- .../net/wireless/mediatek/mt76/mt7925/mcu.c | 20 +++++++++++++------ 1 file changed, 14 insertions(+), 6 deletions(-) diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/mcu.c b/drivers/net/wireless/mediatek/mt76/mt7925/mcu.c index 958ff9da9f..8080fea30d 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7925/mcu.c +++ b/drivers/net/wireless/mediatek/mt76/mt7925/mcu.c @@ -1337,15 +1337,23 @@ int mt7925_mcu_set_mlo_roc(struct mt792x_bss_conf *mconf, u16 sel_links, for (i = 0; i < ARRAY_SIZE(links); i++) { links[i].id = i ? __ffs(~BIT(mconf->link_id) & sel_links) : mconf->link_id; + link_conf = mt792x_vif_to_bss_conf(vif, links[i].id); - if (WARN_ON_ONCE(!link_conf)) - return -EPERM; + if (!link_conf) + return -ENOLINK; links[i].chan = link_conf->chanreq.oper.chan; - if (WARN_ON_ONCE(!links[i].chan)) - return -EPERM; + if (!links[i].chan) + /* Channel not configured yet - this can happen during + * MLO AP setup when links are being added sequentially. + * Return -ENOLINK to indicate link not ready. + */ + return -ENOLINK; links[i].mconf = mt792x_vif_to_link(mvif, links[i].id); + if (!links[i].mconf) + return -ENOLINK; + links[i].tag = links[i].id == mconf->link_id ? UNI_ROC_ACQUIRE : UNI_ROC_SUB_LINK; @@ -1359,8 +1367,8 @@ int mt7925_mcu_set_mlo_roc(struct mt792x_bss_conf *mconf, u16 sel_links, type = MT7925_ROC_REQ_JOIN; for (i = 0; i < ARRAY_SIZE(links) && i < hweight16(vif->active_links); i++) { - if (WARN_ON_ONCE(!links[i].mconf || !links[i].chan)) - continue; + if (!links[i].mconf || !links[i].chan) + return -ENOLINK; chan = links[i].chan; center_ch = ieee80211_frequency_to_channel(chan->center_freq); -- 2.52.0 ^ permalink raw reply related [flat|nested] 113+ messages in thread
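A minimal sketch of the "not ready yet" handling described in this patch, written as a standalone model. struct chan, struct link_conf, MAX_LINKS and collect_roc_channels are illustrative names, not driver symbols. The point is that a missing link or channel is reported with -ENOLINK and no warning, and nothing is acted on until every link has been checked.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the per-link configuration. */
struct chan { int center_freq; };
struct link_conf { struct chan *chan; };

#define MAX_LINKS 2

/*
 * Collect the channel of every selected link. If a link or its channel
 * is not set up yet, fail quietly with -ENOLINK so the caller can retry
 * later instead of treating the situation as a driver bug.
 */
int collect_roc_channels(struct link_conf *links[MAX_LINKS],
			 struct chan *out[MAX_LINKS])
{
	for (int i = 0; i < MAX_LINKS; i++) {
		if (!links[i] || !links[i]->chan)
			return -ENOLINK;
		out[i] = links[i]->chan;
	}
	return 0;
}
```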
* [PATCH v4 14/21] wifi: mt76: mt7925: add NULL checks for MLO link pointers in MCU functions 2026-01-16 1:04 ` [PATCH v4 00/21] wifi: mt76: mt7925/mt7921: stability and MLO fixes Zac ` (12 preceding siblings ...) 2026-01-16 1:05 ` [PATCH v4 13/21] wifi: mt76: mt7925: fix kernel warning in MLO ROC setup Zac @ 2026-01-16 1:05 ` Zac 2026-01-16 1:05 ` [PATCH v4 15/21] wifi: mt76: mt792x: fix firmware reload failure after previous load crash Zac ` (6 subsequent siblings) 20 siblings, 0 replies; 113+ messages in thread From: Zac @ 2026-01-16 1:05 UTC (permalink / raw) To: sean.wang Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless, lorenzo, nbd, linux, ryder.lee, sean.wang, Zac Bowling, Zac Bowling From: Zac Bowling <zbowling@gmail.com> Several MCU functions dereference pointers returned by mt792x_sta_to_link() and mt792x_vif_to_link() without checking for NULL. During MLO state transitions, these functions can return NULL when link state is being set up or torn down, causing kernel NULL pointer dereferences. Add NULL checks in the following functions: - mt7925_mcu_sta_hdr_trans_tlv(): Check mlink before dereferencing wcid - mt7925_mcu_wtbl_update_hdr_trans(): Check mlink and mconf before use - mt7925_mcu_sta_amsdu_tlv(): Check mlink before setting amsdu flag - mt7925_mcu_sta_mld_tlv(): Check mconf and mlink in link iteration loop - mt7925_mcu_sta_update(): Initialize mlink to NULL and check both link_sta and mlink in the ternary condition These race conditions can occur during: - MLO link setup/teardown - Station add/remove operations - Firmware command generation during state transitions Found through static analysis (clang-tidy) and pattern matching similar to fixes in mt7996 and ath12k drivers for MLO link state handling. 
Fixes: c948b5da6bbe ("wifi: mt76: mt7925: add Mediatek Wi-Fi7 driver for mt7925 chips") Signed-off-by: Zac Bowling <zac@zacbowling.com> --- drivers/net/wireless/mediatek/mt76/mt7925/mcu.c | 14 ++++++++++++-- 1 file changed, 12 insertions(+), 2 deletions(-) diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/mcu.c b/drivers/net/wireless/mediatek/mt76/mt7925/mcu.c index 8080fea30d..6f7fc1b9a4 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7925/mcu.c +++ b/drivers/net/wireless/mediatek/mt76/mt7925/mcu.c @@ -1087,6 +1087,8 @@ mt7925_mcu_sta_hdr_trans_tlv(struct sk_buff *skb, struct mt792x_link_sta *mlink; mlink = mt792x_sta_to_link(msta, link_sta->link_id); + if (!mlink) + return; wcid = &mlink->wcid; } else { wcid = &mvif->sta.deflink.wcid; @@ -1120,6 +1122,9 @@ int mt7925_mcu_wtbl_update_hdr_trans(struct mt792x_dev *dev, link_sta = mt792x_sta_to_link_sta(vif, sta, link_id); mconf = mt792x_vif_to_link(mvif, link_id); + if (!mlink || !mconf) + return -EINVAL; + skb = __mt76_connac_mcu_alloc_sta_req(&dev->mt76, &mconf->mt76, &mlink->wcid, MT7925_STA_UPDATE_MAX_SIZE); @@ -1751,6 +1756,8 @@ mt7925_mcu_sta_amsdu_tlv(struct sk_buff *skb, amsdu->amsdu_en = true; mlink = mt792x_sta_to_link(msta, link_sta->link_id); + if (!mlink) + return; mlink->wcid.amsdu = true; switch (link_sta->agg.max_amsdu_len) { @@ -1953,6 +1960,9 @@ mt7925_mcu_sta_mld_tlv(struct sk_buff *skb, mconf = mt792x_vif_to_link(mvif, i); mlink = mt792x_sta_to_link(msta, i); + if (!mconf || !mlink) + continue; + mld->link[cnt].wlan_id = cpu_to_le16(mlink->wcid.idx); mld->link[cnt++].bss_idx = mconf->mt76.idx; @@ -2045,7 +2055,7 @@ int mt7925_mcu_sta_update(struct mt792x_dev *dev, .rcpi = to_rcpi(rssi), }; struct mt792x_sta *msta; - struct mt792x_link_sta *mlink; + struct mt792x_link_sta *mlink = NULL; lockdep_assert_held(&dev->mt76.mutex); @@ -2053,7 +2063,7 @@ int mt7925_mcu_sta_update(struct mt792x_dev *dev, msta = (struct mt792x_sta *)link_sta->sta->drv_priv; mlink = mt792x_sta_to_link(msta, 
link_sta->link_id); } - info.wcid = link_sta ? &mlink->wcid : &mvif->sta.deflink.wcid; + info.wcid = (link_sta && mlink) ? &mlink->wcid : &mvif->sta.deflink.wcid; info.newly = state != MT76_STA_INFO_STATE_ASSOC; return mt7925_mcu_sta_cmd(&dev->mphy, &info); -- 2.52.0 ^ permalink raw reply related [flat|nested] 113+ messages in thread
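The loop change in mt7925_mcu_sta_mld_tlv() follows a common pattern: a bit set in the valid-links bitmap does not guarantee that the per-link object exists, so each lookup is checked and absent links are skipped. The sketch below is a userspace model with made-up names (struct link, fill_links, a MAX_LINKS of 15); it is not the driver function.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_LINKS 15	/* assumed table size for this sketch */

struct link { uint16_t wlan_id; };	/* hypothetical per-link object */

/*
 * Copy the IDs of links that are both marked valid in the bitmap and
 * actually present in the table. Returns how many entries were written.
 */
size_t fill_links(uint16_t valid_links, struct link *links[MAX_LINKS],
		  uint16_t out[MAX_LINKS])
{
	size_t cnt = 0;

	for (unsigned int i = 0; i < MAX_LINKS; i++) {
		if (!(valid_links & (1u << i)))
			continue;
		if (!links[i])	/* marked valid, but not set up (or already gone) */
			continue;
		out[cnt++] = links[i]->wlan_id;
	}
	return cnt;
}
```

Skipping rather than failing keeps the entries for the remaining links intact, which matches the use of continue in the hunk above.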
* [PATCH v4 15/21] wifi: mt76: mt792x: fix firmware reload failure after previous load crash 2026-01-16 1:04 ` [PATCH v4 00/21] wifi: mt76: mt7925/mt7921: stability and MLO fixes Zac ` (13 preceding siblings ...) 2026-01-16 1:05 ` [PATCH v4 14/21] wifi: mt76: mt7925: add NULL checks for MLO link pointers in MCU functions Zac @ 2026-01-16 1:05 ` Zac 2026-01-16 1:05 ` [PATCH v4 16/21] wifi: mt76: mt7925: add mutex protection in resume path Zac ` (5 subsequent siblings) 20 siblings, 0 replies; 113+ messages in thread From: Zac @ 2026-01-16 1:05 UTC (permalink / raw) To: sean.wang Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless, lorenzo, nbd, linux, ryder.lee, sean.wang, Zac Bowling, Zac Bowling From: Zac Bowling <zbowling@gmail.com> If the firmware loading process crashes or is interrupted after acquiring the patch semaphore but before releasing it, subsequent firmware load attempts will fail with 'Failed to get patch semaphore' because the semaphore is still held. This issue manifests as devices becoming unusable after suspend/resume failures or firmware crashes, requiring a full hardware reboot to recover. This has been widely reported on MT7921 and MT7925 devices. Example error log: mt7921e 0000:c2:00.0: Failed to get patch semaphore mt7921e 0000:c2:00.0: probe with driver mt7921e failed with error -5 Apply the same fix that was applied to MT7915 in commit 79dd14f: 1. Release the patch semaphore before starting firmware load (in case it was held by a previous failed attempt) 2. Restart MCU firmware to ensure clean state 3. Wait briefly for MCU to be ready This fix applies to both MT7921 and MT7925 drivers which share the mt792x_load_firmware() function. 
Fixes: 583204ae70f9 ("wifi: mt76: mt792x: move mt7921_load_firmware in mt792x-lib module") Link: https://github.com/openwrt/mt76/commit/79dd14f2e8161b656341b6653261779199aedbe4 Signed-off-by: Zac Bowling <zac@zacbowling.com> --- drivers/net/wireless/mediatek/mt76/mt792x_core.c | 14 ++++++++++++++ 1 file changed, 14 insertions(+) diff --git a/drivers/net/wireless/mediatek/mt76/mt792x_core.c b/drivers/net/wireless/mediatek/mt76/mt792x_core.c index 9dc768aa8b..05598202b4 100644 --- a/drivers/net/wireless/mediatek/mt76/mt792x_core.c +++ b/drivers/net/wireless/mediatek/mt76/mt792x_core.c @@ -936,6 +936,20 @@ int mt792x_load_firmware(struct mt792x_dev *dev) { int ret; + /* Release semaphore if taken by previous failed load attempt. + * This prevents "Failed to get patch semaphore" errors when + * recovering from firmware crashes or suspend/resume failures. + */ + ret = mt76_connac_mcu_patch_sem_ctrl(&dev->mt76, false); + if (ret < 0) + dev_dbg(dev->mt76.dev, "Semaphore release returned %d (may be expected)\n", ret); + + /* Always restart MCU to ensure clean state before loading firmware */ + mt76_connac_mcu_restart(&dev->mt76); + + /* Wait for MCU to be ready after restart */ + msleep(100); + ret = mt76_connac2_load_patch(&dev->mt76, mt792x_patch_name(dev)); if (ret) return ret; -- 2.52.0 ^ permalink raw reply related [flat|nested] 113+ messages in thread
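To illustrate why a stale semaphore blocks every later load, here is a toy model. struct fake_mcu and the functions below are hypothetical; the real semaphore lives in the firmware and is driven through MCU commands, and the MCU restart and 100 ms wait from the patch are not modelled.

```c
#include <errno.h>
#include <stdbool.h>

/* Toy device state: the flag survives a failed load attempt. */
struct fake_mcu {
	bool patch_sem_held;
	bool fw_loaded;
};

int patch_sem_get(struct fake_mcu *m)
{
	if (m->patch_sem_held)
		return -EIO;	/* models "Failed to get patch semaphore" */
	m->patch_sem_held = true;
	return 0;
}

void patch_sem_release(struct fake_mcu *m)
{
	m->patch_sem_held = false;
}

/*
 * With release_stale set, any semaphore left over from an interrupted
 * load is dropped before a new one is requested.
 */
int load_firmware(struct fake_mcu *m, bool release_stale)
{
	int ret;

	if (release_stale)
		patch_sem_release(m);

	ret = patch_sem_get(m);
	if (ret)
		return ret;

	m->fw_loaded = true;
	patch_sem_release(m);
	return 0;
}
```

Starting from a state where the semaphore is still held, the load fails without the up-front release and succeeds with it.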
* [PATCH v4 16/21] wifi: mt76: mt7925: add mutex protection in resume path 2026-01-16 1:04 ` [PATCH v4 00/21] wifi: mt76: mt7925/mt7921: stability and MLO fixes Zac ` (14 preceding siblings ...) 2026-01-16 1:05 ` [PATCH v4 15/21] wifi: mt76: mt792x: fix firmware reload failure after previous load crash Zac @ 2026-01-16 1:05 ` Zac 2026-01-16 1:05 ` [PATCH v4 17/21] wifi: mt76: mt7925: add NULL checks in link station and TX queue setup Zac ` (4 subsequent siblings) 20 siblings, 0 replies; 113+ messages in thread From: Zac @ 2026-01-16 1:05 UTC (permalink / raw) To: sean.wang Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless, lorenzo, nbd, linux, ryder.lee, sean.wang, Zac Bowling, Zac Bowling From: Zac Bowling <zbowling@gmail.com> Add mutex protection around mt7925_mcu_set_deep_sleep() and mt7925_mcu_regd_update() calls in the resume path to prevent potential race conditions during resume operations. These MCU operations require serialization, and the resume path was the only call site missing mutex protection. Without this, concurrent access during resume could corrupt firmware state or cause race conditions with other MCU commands. Found by static analysis (sparse/coccinelle) pattern matching for unprotected MCU function calls. 
Fixes: c948b5da6bbe ("wifi: mt76: mt7925: add Mediatek Wi-Fi7 driver for mt7925 chips") Signed-off-by: Zac Bowling <zac@zacbowling.com> --- drivers/net/wireless/mediatek/mt76/mt7925/pci.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/pci.c b/drivers/net/wireless/mediatek/mt76/mt7925/pci.c index e9d62c6aee..3a9e32a175 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7925/pci.c +++ b/drivers/net/wireless/mediatek/mt76/mt7925/pci.c @@ -584,10 +584,12 @@ static int _mt7925_pci_resume(struct device *device, bool restore) } /* restore previous ds setting */ + mt792x_mutex_acquire(dev); if (!pm->ds_enable) mt7925_mcu_set_deep_sleep(dev, false); mt7925_mcu_regd_update(dev, mdev->alpha2, dev->country_ie_env); + mt792x_mutex_release(dev); failed: pm->suspended = false; -- 2.52.0 ^ permalink raw reply related [flat|nested] 113+ messages in thread
* [PATCH v4 17/21] wifi: mt76: mt7925: add NULL checks in link station and TX queue setup 2026-01-16 1:04 ` [PATCH v4 00/21] wifi: mt76: mt7925/mt7921: stability and MLO fixes Zac ` (15 preceding siblings ...) 2026-01-16 1:05 ` [PATCH v4 16/21] wifi: mt76: mt7925: add mutex protection in resume path Zac @ 2026-01-16 1:05 ` Zac 2026-01-16 1:05 ` [PATCH v4 18/21] wifi: mt76: mt7921: fix missing mutex protection in multiple paths Zac ` (3 subsequent siblings) 20 siblings, 0 replies; 113+ messages in thread From: Zac @ 2026-01-16 1:05 UTC (permalink / raw) To: sean.wang Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless, lorenzo, nbd, linux, ryder.lee, sean.wang, Zac Bowling, Zac Bowling From: Zac Bowling <zbowling@gmail.com> Add NULL pointer checks for mt792x_sta_to_link() and mt792x_vif_to_link() results in critical paths to prevent kernel crashes during MLO operations. Functions fixed: - mt7925_mac_link_sta_add(): Check mlink and mconf before dereferencing - mt7925_conf_tx(): Check mconf before accessing queue_params These can be NULL during MLO link setup/teardown when mac80211 state may not be fully synchronized with driver state. Found through static analysis and pattern matching. 
Fixes: c948b5da6bbe ("wifi: mt76: mt7925: add Mediatek Wi-Fi7 driver for mt7925 chips") Signed-off-by: Zac Bowling <zac@zacbowling.com> --- drivers/net/wireless/mediatek/mt76/mt7925/main.c | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/main.c b/drivers/net/wireless/mediatek/mt76/mt7925/main.c index 7cf6faa1f6..81373e479a 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7925/main.c +++ b/drivers/net/wireless/mediatek/mt76/mt7925/main.c @@ -871,12 +871,17 @@ static int mt7925_mac_link_sta_add(struct mt76_dev *mdev, msta = (struct mt792x_sta *)link_sta->sta->drv_priv; mlink = mt792x_sta_to_link(msta, link_id); + if (!mlink) + return -EINVAL; idx = mt76_wcid_alloc(dev->mt76.wcid_mask, MT792x_WTBL_STA - 1); if (idx < 0) return -ENOSPC; mconf = mt792x_vif_to_link(mvif, link_id); + if (!mconf) + return -EINVAL; + mt76_wcid_init(&mlink->wcid, 0); mlink->wcid.sta = 1; mlink->wcid.idx = idx; @@ -1735,6 +1740,9 @@ mt7925_conf_tx(struct ieee80211_hw *hw, struct ieee80211_vif *vif, [IEEE80211_AC_BK] = 1, }; + if (!mconf) + return -EINVAL; + /* firmware uses access class index */ mconf->queue_params[mq_to_aci[queue]] = *params; -- 2.52.0 ^ permalink raw reply related [flat|nested] 113+ messages in thread
* [PATCH v4 18/21] wifi: mt76: mt7921: fix missing mutex protection in multiple paths 2026-01-16 1:04 ` [PATCH v4 00/21] wifi: mt76: mt7925/mt7921: stability and MLO fixes Zac ` (16 preceding siblings ...) 2026-01-16 1:05 ` [PATCH v4 17/21] wifi: mt76: mt7925: add NULL checks in link station and TX queue setup Zac @ 2026-01-16 1:05 ` Zac 2026-01-16 1:05 ` [PATCH v4 19/21] wifi: mt76: mt7921: fix mutex deadlocks " Zac ` (2 subsequent siblings) 20 siblings, 0 replies; 113+ messages in thread From: Zac @ 2026-01-16 1:05 UTC (permalink / raw) To: sean.wang Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless, lorenzo, nbd, linux, ryder.lee, sean.wang, Zac Add mt792x_mutex_acquire/release around ieee80211_iterate_*() calls in MT7921 driver to prevent race conditions: - mt7921_roc_abort_sync(): protect ROC abort iteration - mt7921_set_runtime_pm(): protect runtime PM iteration - mt7921_regd_set_6ghz_power_type(): protect 6GHz power type iteration - mt7921_mac_reset_work(): protect vif reconnect iteration after reset These paths were missing the mutex protection that is required when calling ieee80211_iterate_* functions with ITER_RESUME_ALL flag. 
Signed-off-by: Zac Bowling <zac@zacbowling.com> --- drivers/net/wireless/mediatek/mt76/mt7921/mac.c | 2 ++ drivers/net/wireless/mediatek/mt76/mt7921/main.c | 9 ++++++++- 2 files changed, 10 insertions(+), 1 deletion(-) diff --git a/drivers/net/wireless/mediatek/mt76/mt7921/mac.c b/drivers/net/wireless/mediatek/mt76/mt7921/mac.c index 03b4960db7..f5c882e45b 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7921/mac.c +++ b/drivers/net/wireless/mediatek/mt76/mt7921/mac.c @@ -693,9 +693,11 @@ void mt7921_mac_reset_work(struct work_struct *work) clear_bit(MT76_RESET, &dev->mphy.state); pm->suspended = false; ieee80211_wake_queues(hw); + mt792x_mutex_acquire(dev); ieee80211_iterate_active_interfaces(hw, IEEE80211_IFACE_ITER_RESUME_ALL, mt7921_vif_connect_iter, NULL); + mt792x_mutex_release(dev); mt76_connac_power_save_sched(&dev->mt76.phy, pm); } diff --git a/drivers/net/wireless/mediatek/mt76/mt7921/main.c b/drivers/net/wireless/mediatek/mt76/mt7921/main.c index 5fae9a6e27..8fc3770d1b 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7921/main.c +++ b/drivers/net/wireless/mediatek/mt76/mt7921/main.c @@ -373,10 +373,13 @@ void mt7921_roc_abort_sync(struct mt792x_dev *dev) timer_delete_sync(&phy->roc_timer); cancel_work_sync(&phy->roc_work); - if (test_and_clear_bit(MT76_STATE_ROC, &phy->mt76->state)) + if (test_and_clear_bit(MT76_STATE_ROC, &phy->mt76->state)) { + mt792x_mutex_acquire(dev); ieee80211_iterate_interfaces(mt76_hw(dev), IEEE80211_IFACE_ITER_RESUME_ALL, mt7921_roc_iter, (void *)phy); + mt792x_mutex_release(dev); + } } EXPORT_SYMBOL_GPL(mt7921_roc_abort_sync); @@ -619,9 +622,11 @@ void mt7921_set_runtime_pm(struct mt792x_dev *dev) bool monitor = !!(hw->conf.flags & IEEE80211_CONF_MONITOR); pm->enable = pm->enable_user && !monitor; + mt792x_mutex_acquire(dev); ieee80211_iterate_active_interfaces(hw, IEEE80211_IFACE_ITER_RESUME_ALL, mt7921_pm_interface_iter, dev); + mt792x_mutex_release(dev); pm->ds_enable = pm->ds_enable_user && !monitor; 
mt76_connac_mcu_set_deep_sleep(&dev->mt76, pm->ds_enable); } @@ -765,9 +770,11 @@ mt7921_regd_set_6ghz_power_type(struct ieee80211_vif *vif, bool is_add) struct mt792x_dev *dev = phy->dev; u32 valid_vif_num = 0; + mt792x_mutex_acquire(dev); ieee80211_iterate_active_interfaces(mt76_hw(dev), IEEE80211_IFACE_ITER_RESUME_ALL, mt7921_calc_vif_num, &valid_vif_num); + mt792x_mutex_release(dev); if (valid_vif_num > 1) { phy->power_type = MT_AP_DEFAULT; -- 2.52.0 ^ permalink raw reply related [flat|nested] 113+ messages in thread
* [PATCH v4 19/21] wifi: mt76: mt7921: fix mutex deadlocks in multiple paths 2026-01-16 1:04 ` [PATCH v4 00/21] wifi: mt76: mt7925/mt7921: stability and MLO fixes Zac ` (17 preceding siblings ...) 2026-01-16 1:05 ` [PATCH v4 18/21] wifi: mt76: mt7921: fix missing mutex protection in multiple paths Zac @ 2026-01-16 1:05 ` Zac 2026-01-16 1:05 ` [PATCH v4 20/21] wifi: mt76: fix list corruption in mt76_wcid_cleanup Zac 2026-01-16 1:05 ` [PATCH v4 21/21] wifi: mt76: mt7925: fix BA session teardown during beacon loss Zac 20 siblings, 0 replies; 113+ messages in thread From: Zac @ 2026-01-16 1:05 UTC (permalink / raw) To: sean.wang Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless, lorenzo, nbd, linux, ryder.lee, sean.wang, Zac Fix mutex handling to prevent deadlocks: - mt7921_roc_abort_sync(): Remove internal mutex acquire/release since this function is called from contexts that already hold the mutex (mt7921_mac_sta_remove via mt76_sta_remove). Add mutex at caller sites that don't hold it (pci.c and sdio.c suspend paths). - mt7921_set_runtime_pm(): Remove internal mutex acquire/release since the only caller (debugfs) already holds the mutex. The previous patches incorrectly added mutex acquire inside functions that can be called from contexts where the mutex is already held, causing deadlocks. 
Signed-off-by: Zac Bowling <zac@zacbowling.com> --- drivers/net/wireless/mediatek/mt76/mt7921/main.c | 13 +++++++------ drivers/net/wireless/mediatek/mt76/mt7921/pci.c | 2 ++ drivers/net/wireless/mediatek/mt76/mt7921/sdio.c | 2 ++ 3 files changed, 11 insertions(+), 6 deletions(-) diff --git a/drivers/net/wireless/mediatek/mt76/mt7921/main.c b/drivers/net/wireless/mediatek/mt76/mt7921/main.c index 8fc3770d1b..9315dbdf88 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7921/main.c +++ b/drivers/net/wireless/mediatek/mt76/mt7921/main.c @@ -373,13 +373,15 @@ void mt7921_roc_abort_sync(struct mt792x_dev *dev) timer_delete_sync(&phy->roc_timer); cancel_work_sync(&phy->roc_work); - if (test_and_clear_bit(MT76_STATE_ROC, &phy->mt76->state)) { - mt792x_mutex_acquire(dev); + /* Note: caller must hold mutex if ieee80211_iterate_interfaces is + * needed for ROC cleanup. Some call sites (like mt7921_mac_sta_remove) + * already hold the mutex via mt76_sta_remove(). For suspend paths, + * the mutex should be acquired before calling this function. 
+ */ + if (test_and_clear_bit(MT76_STATE_ROC, &phy->mt76->state)) ieee80211_iterate_interfaces(mt76_hw(dev), IEEE80211_IFACE_ITER_RESUME_ALL, mt7921_roc_iter, (void *)phy); - mt792x_mutex_release(dev); - } } EXPORT_SYMBOL_GPL(mt7921_roc_abort_sync); @@ -622,11 +624,10 @@ void mt7921_set_runtime_pm(struct mt792x_dev *dev) bool monitor = !!(hw->conf.flags & IEEE80211_CONF_MONITOR); pm->enable = pm->enable_user && !monitor; - mt792x_mutex_acquire(dev); + /* Note: caller (debugfs) must hold mutex before calling this function */ ieee80211_iterate_active_interfaces(hw, IEEE80211_IFACE_ITER_RESUME_ALL, mt7921_pm_interface_iter, dev); - mt792x_mutex_release(dev); pm->ds_enable = pm->ds_enable_user && !monitor; mt76_connac_mcu_set_deep_sleep(&dev->mt76, pm->ds_enable); } diff --git a/drivers/net/wireless/mediatek/mt76/mt7921/pci.c b/drivers/net/wireless/mediatek/mt76/mt7921/pci.c index ec96861832..9f76b334b9 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7921/pci.c +++ b/drivers/net/wireless/mediatek/mt76/mt7921/pci.c @@ -426,7 +426,9 @@ static int mt7921_pci_suspend(struct device *device) cancel_delayed_work_sync(&pm->ps_work); cancel_work_sync(&pm->wake_work); + mt792x_mutex_acquire(dev); mt7921_roc_abort_sync(dev); + mt792x_mutex_release(dev); err = mt792x_mcu_drv_pmctrl(dev); if (err < 0) diff --git a/drivers/net/wireless/mediatek/mt76/mt7921/sdio.c b/drivers/net/wireless/mediatek/mt76/mt7921/sdio.c index 3421e53dc9..92ea281181 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7921/sdio.c +++ b/drivers/net/wireless/mediatek/mt76/mt7921/sdio.c @@ -219,7 +219,9 @@ static int mt7921s_suspend(struct device *__dev) cancel_delayed_work_sync(&pm->ps_work); cancel_work_sync(&pm->wake_work); + mt792x_mutex_acquire(dev); mt7921_roc_abort_sync(dev); + mt792x_mutex_release(dev); err = mt792x_mcu_drv_pmctrl(dev); if (err < 0) -- 2.52.0 ^ permalink raw reply related [flat|nested] 113+ messages in thread
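The deadlock fixed here comes from taking a non-recursive mutex twice on the same call path. The following toy model (all names hypothetical) replaces the mutex with a flag so that the self-deadlock appears as -EDEADLK instead of a hang, and contrasts a helper that locks internally with one that expects the caller to hold the lock.

```c
#include <errno.h>
#include <stdbool.h>

struct toy_dev {
	bool mutex_held;
	int aborts;
};

int toy_lock(struct toy_dev *d)
{
	if (d->mutex_held)
		return -EDEADLK;	/* a real mutex would block forever here */
	d->mutex_held = true;
	return 0;
}

void toy_unlock(struct toy_dev *d)
{
	d->mutex_held = false;
}

/* Caller must hold the lock; the check plays the role of a lockdep assertion. */
int roc_abort_locked(struct toy_dev *d)
{
	if (!d->mutex_held)
		return -EPERM;
	d->aborts++;
	return 0;
}

/* Variant that takes the lock itself, as the earlier patch did. */
int roc_abort_self_locking(struct toy_dev *d)
{
	int ret = toy_lock(d);

	if (ret)
		return ret;
	ret = roc_abort_locked(d);
	toy_unlock(d);
	return ret;
}

/* Models a path such as station removal that already holds the lock. */
int sta_remove(struct toy_dev *d, int (*abort_fn)(struct toy_dev *))
{
	int ret = toy_lock(d);

	if (ret)
		return ret;
	ret = abort_fn(d);
	toy_unlock(d);
	return ret;
}
```

Calling sta_remove() with the self-locking variant fails, while the caller-holds-lock variant works from locked paths and refuses to run from unlocked ones. That is why the suspend paths in pci.c and sdio.c now take the mutex around the call.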
* [PATCH v4 20/21] wifi: mt76: fix list corruption in mt76_wcid_cleanup 2026-01-16 1:04 ` [PATCH v4 00/21] wifi: mt76: mt7925/mt7921: stability and MLO fixes Zac ` (18 preceding siblings ...) 2026-01-16 1:05 ` [PATCH v4 19/21] wifi: mt76: mt7921: fix mutex deadlocks " Zac @ 2026-01-16 1:05 ` Zac 2026-01-16 1:05 ` [PATCH v4 21/21] wifi: mt76: mt7925: fix BA session teardown during beacon loss Zac 20 siblings, 0 replies; 113+ messages in thread From: Zac @ 2026-01-16 1:05 UTC (permalink / raw) To: sean.wang Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless, lorenzo, nbd, linux, ryder.lee, sean.wang, Zac, Zac Bowling mt76_wcid_cleanup() was not removing wcid entries from sta_poll_list before mt76_reset_device() reinitializes the master list. This leaves stale pointers in wcid->poll_list, causing list corruption when mt76_wcid_add_poll() later checks list_empty() and tries to add the entry back. The fix adds proper cleanup of poll_list in mt76_wcid_cleanup(), matching how tx_list is already handled. This is similar to what mt7996_mac_sta_deinit_link() already does correctly. Fixes list corruption warnings like: list_add corruption. prev->next should be next (ffffffff...) Signed-off-by: Zac Bowling <zbowling@gmail.com> --- drivers/net/wireless/mediatek/mt76/mac80211.c | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/drivers/net/wireless/mediatek/mt76/mac80211.c b/drivers/net/wireless/mediatek/mt76/mac80211.c index 75772979f4..d0c522909e 100644 --- a/drivers/net/wireless/mediatek/mt76/mac80211.c +++ b/drivers/net/wireless/mediatek/mt76/mac80211.c @@ -1716,6 +1716,16 @@ void mt76_wcid_cleanup(struct mt76_dev *dev, struct mt76_wcid *wcid) idr_destroy(&wcid->pktid); + /* Remove from sta_poll_list to prevent list corruption after reset. + * Without this, mt76_reset_device() reinitializes sta_poll_list but + * leaves wcid->poll_list with stale pointers, causing list corruption + * when mt76_wcid_add_poll() checks list_empty(). 
+ */ + spin_lock_bh(&dev->sta_poll_lock); + if (!list_empty(&wcid->poll_list)) + list_del_init(&wcid->poll_list); + spin_unlock_bh(&dev->sta_poll_lock); + spin_lock_bh(&phy->tx_lock); if (!list_empty(&wcid->tx_list)) -- 2.52.0 ^ permalink raw reply related [flat|nested] 113+ messages in thread
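A toy doubly linked list shows the inconsistency that arises when a list head is reinitialised while an entry is still linked to it. This is a simplified userspace re-implementation with illustrative names, not the kernel's list.h, and add_poll() only models the "add if not already queued" behaviour that the commit message attributes to mt76_wcid_add_poll(). How the stale entry leads to the reported list_add corruption in the driver is not modelled.

```c
#include <stdbool.h>

struct list_head {
	struct list_head *next, *prev;
};

void list_init(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

bool list_is_empty(const struct list_head *h)
{
	return h->next == h;
}

void list_append(struct list_head *n, struct list_head *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/* Unlink the entry and leave it pointing at itself, like list_del_init(). */
void list_del_reinit(struct list_head *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	list_init(n);
}

/* Queue the entry only if it does not already look queued. */
bool add_poll(struct list_head *entry, struct list_head *poll_list)
{
	if (!list_is_empty(entry))
		return false;
	list_append(entry, poll_list);
	return true;
}
```

If the head is reinitialised without unlinking the entry first, the head looks empty while the entry still looks queued, so the two views of the list disagree. Unlinking with list_del_reinit() before the reset keeps them consistent.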
* [PATCH v4 21/21] wifi: mt76: mt7925: fix BA session teardown during beacon loss 2026-01-16 1:04 ` [PATCH v4 00/21] wifi: mt76: mt7925/mt7921: stability and MLO fixes Zac ` (19 preceding siblings ...) 2026-01-16 1:05 ` [PATCH v4 20/21] wifi: mt76: fix list corruption in mt76_wcid_cleanup Zac @ 2026-01-16 1:05 ` Zac 20 siblings, 0 replies; 113+ messages in thread From: Zac @ 2026-01-16 1:05 UTC (permalink / raw) To: sean.wang Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless, lorenzo, nbd, linux, ryder.lee, sean.wang, Zac, Zac Bowling The ieee80211_stop_tx_ba_cb_irqsafe() callback was conditionally called only when the MCU command succeeded. However, during beacon connection loss, the MCU command may fail because the AP is no longer reachable. If the callback is not called, mac80211's BA session state machine gets stuck in an intermediate state. When mac80211 later tries to tear down all BA sessions during disconnection, it hits a WARN in __ieee80211_stop_tx_ba_session() due to the inconsistent state. Fix by making the callback unconditional, matching the behavior of mt7921 and mt7996 drivers. The MCU command failure is acceptable during disconnection - what matters is that mac80211 is notified to complete the session teardown. 
Reported-by: Sean Wang <sean.wang@mediatek.com> Signed-off-by: Zac Bowling <zbowling@gmail.com> --- drivers/net/wireless/mediatek/mt76/mt7925/main.c | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/main.c b/drivers/net/wireless/mediatek/mt76/mt7925/main.c index 81373e479a..cc7ef2c170 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7925/main.c +++ b/drivers/net/wireless/mediatek/mt76/mt7925/main.c @@ -1323,9 +1323,13 @@ mt7925_ampdu_action(struct ieee80211_hw *hw, struct ieee80211_vif *vif, case IEEE80211_AMPDU_TX_STOP_CONT: mtxq->aggr = false; clear_bit(tid, &msta->deflink.wcid.ampdu_state); - ret = mt7925_mcu_uni_tx_ba(dev, params, false); - if (!ret) - ieee80211_stop_tx_ba_cb_irqsafe(vif, sta->addr, tid); + /* MCU command may fail during beacon loss, but callback must + * always be called to complete the BA session teardown in + * mac80211. Otherwise the state machine gets stuck and triggers + * WARN in __ieee80211_stop_tx_ba_session(). + */ + mt7925_mcu_uni_tx_ba(dev, params, false); + ieee80211_stop_tx_ba_cb_irqsafe(vif, sta->addr, tid); break; } mt792x_mutex_release(dev); -- 2.52.0 ^ permalink raw reply related [flat|nested] 113+ messages in thread
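The state-machine argument can be sketched with a three-state model. enum ba_state, struct ba_session, ba_stop_done() and stop_tx_ba() are hypothetical names; ba_stop_done() stands in for the completion callback that mac80211 expects, and mcu_ret stands in for the result of the firmware command.

```c
#include <errno.h>
#include <stdbool.h>

enum ba_state { BA_OPERATIONAL, BA_STOPPING, BA_IDLE };

struct ba_session {
	enum ba_state state;
};

/* Stand-in for the "stop completed" notification to the upper layer. */
void ba_stop_done(struct ba_session *s)
{
	s->state = BA_IDLE;
}

/*
 * Tear down a session. When always_notify is false the completion is
 * reported only if the firmware command succeeded (old behaviour);
 * when true it is reported regardless (behaviour after this patch).
 */
void stop_tx_ba(struct ba_session *s, int mcu_ret, bool always_notify)
{
	s->state = BA_STOPPING;
	if (mcu_ret == 0 || always_notify)
		ba_stop_done(s);
}
```

With a failing command, the conditional variant leaves the session in BA_STOPPING, which is the stuck intermediate state described above; the unconditional variant always reaches BA_IDLE.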
* [PATCH v5 00/11] wifi: mt76: mt7925/mt7921 stability fixes 2026-01-16 0:15 ` [PATCH v3 00/17] wifi: mt76: mt7925/mt792x: comprehensive stability fixes Sean Wang 2026-01-16 0:43 ` Zac Bowling 2026-01-16 1:04 ` [PATCH v4 00/21] wifi: mt76: mt7925/mt7921: stability and MLO fixes Zac @ 2026-01-20 6:28 ` Zac 2026-01-20 6:28 ` [PATCH 01/11] wifi: mt76: fix list corruption in mt76_wcid_cleanup Zac ` (10 more replies) 2 siblings, 11 replies; 113+ messages in thread
From: Zac @ 2026-01-20 6:28 UTC (permalink / raw)
To: sean.wang
Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless, lorenzo, nbd, ryder.lee, sean.wang, stable, linux, zbowling, Zac Bowling

From: Zac Bowling <zac@zacbowling.com>

This series addresses stability issues in the mt7925 (WiFi 7) and mt7921
drivers, focusing on NULL pointer dereferences, mutex protection, MLO
(Multi-Link Operation) handling, and ROC (Remain-On-Channel) state machine
fixes.

Changes since v4:
- Reorganized 27 patches into 11 cleaner, logically-grouped patches for
  easier review. Patches are now ordered by subsystem dependency:
  mt76 core -> mt792x shared -> mt7921 -> mt7925
- Consolidated ROC-related fixes (previously patches 22-27) into a single
  comprehensive patch (11/11) that addresses the interconnected deadlock
  and race condition issues discovered through extended testing
- New issues fixed since v4:
  * ROC deadlock in sta removal path - cancel_work_sync() was waiting for
    roc_work which needed the mutex already held by sta_remove
  * ROC timer race during suspend - timer could fire after suspend started
    but before ROC was properly aborted
  * Async ROC abort race condition - double-free when async abort raced
    with normal ROC completion
  * Added ROC rate limiting with exponential backoff to mitigate MLO
    authentication failures caused by rapid ROC requests overwhelming
    the MT7925 firmware
  * Fixed spurious ieee80211_remain_on_channel_expired() callback when
    ROC wasn't actually active (found via code review)
- Added corresponding mt7921 fixes (patches 3-4) since the older driver
  shares similar code paths and exhibited the same deadlock patterns
- Firmware reload fix (patch 2) addresses crashes when the device needs
  recovery after a failed firmware load - the semaphore wasn't being
  released, causing subsequent loads to hang

Investigation and Testing:

All issues were discovered through real-world testing on Framework 16
laptops with the MT7925 (RZ616) WiFi module. Crash dumps, dmesg logs, and
detailed analysis are available in the repository below.

A DKMS version with extensive debug logging is available for community
testing. This has been instrumental in tracking down the more subtle race
conditions and deadlocks that only manifest under specific timing
conditions.

Repository: https://github.com/zbowling/mt7925
- kernels/ - Pre-built patches for 6.17, 6.18, 6.19-rc, nbd168
- dkms/    - DKMS module with extra debug logging
- crashes/ - Crash investigation logs and analysis

Acknowledgments:

Thank you to the community members who tested the DKMS version and
provided crash reports, dmesg dumps, and helped track down the more
elusive deadlocks. Your patience and detailed bug reports made these
fixes possible.

Tested on MT7925 (RZ616) with kernels 6.17.13, 6.18.5, and 6.19-rc5.

Zac Bowling (11):
  wifi: mt76: fix list corruption in mt76_wcid_cleanup
  wifi: mt76: mt792x: fix NULL pointer and firmware reload issues
  wifi: mt76: mt7921: add mutex protection in critical paths
  wifi: mt76: mt7921: fix deadlock in sta removal and suspend ROC abort
  wifi: mt76: mt7925: add comprehensive NULL pointer protection for MLO
  wifi: mt76: mt7925: add mutex protection in critical paths
  wifi: mt76: mt7925: add MCU command error handling
  wifi: mt76: mt7925: add lockdep assertions for mutex verification
  wifi: mt76: mt7925: fix MLO roaming and ROC setup issues
  wifi: mt76: mt7925: fix BA session teardown during beacon loss
  wifi: mt76: mt7925: fix ROC deadlocks and race conditions

 drivers/net/wireless/mediatek/mt76/mac80211.c    |   8 +
 drivers/net/wireless/mediatek/mt76/mt76.h        |   1 +
 drivers/net/wireless/mediatek/mt76/mt7921/mac.c  |   2 +
 drivers/net/wireless/mediatek/mt76/mt7921/main.c |  37 ++-
 drivers/net/wireless/mediatek/mt76/mt7921/pci.c  |   2 -
 drivers/net/wireless/mediatek/mt76/mt7921/sdio.c |   2 -
 drivers/net/wireless/mediatek/mt76/mt7925/mac.c  |   8 +
 drivers/net/wireless/mediatek/mt76/mt7925/main.c | 257 +++++++++++++--
 drivers/net/wireless/mediatek/mt76/mt7925/mcu.c  |  46 ++-
 drivers/net/wireless/mediatek/mt76/mt7925/pci.c  |   4 +
 drivers/net/wireless/mediatek/mt76/mt792x.h      |   7 +
 drivers/net/wireless/mediatek/mt76/mt792x_core.c |  17 +-
 12 files changed, 340 insertions(+), 51 deletions(-)

--
2.52.0

^ permalink raw reply [flat|nested] 113+ messages in thread
* [PATCH 01/11] wifi: mt76: fix list corruption in mt76_wcid_cleanup 2026-01-20 6:28 ` [PATCH v5 00/11] wifi: mt76: mt7925/mt7921 stability fixes Zac @ 2026-01-20 6:28 ` Zac 2026-01-20 6:28 ` [PATCH 02/11] wifi: mt76: mt792x: fix NULL pointer and firmware reload issues Zac ` (9 subsequent siblings) 10 siblings, 0 replies; 113+ messages in thread From: Zac @ 2026-01-20 6:28 UTC (permalink / raw) To: sean.wang Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless, lorenzo, nbd, ryder.lee, sean.wang, stable, linux, zbowling, Zac Bowling From: Zac Bowling <zac@zacbowling.com> mt76_wcid_cleanup() was not removing wcid entries from sta_poll_list before mt76_reset_device() reinitializes the master list. This leaves stale pointers in wcid->poll_list, causing list corruption when mt76_wcid_add_poll() later checks list_empty() and tries to add the entry back. The fix adds proper cleanup of poll_list in mt76_wcid_cleanup(), matching how tx_list is already handled. This is similar to what mt7996_mac_sta_deinit_link() already does correctly. Fixes list corruption warnings like: list_add corruption. prev->next should be next (ffffffff...) Signed-off-by: Zac Bowling <zac@zacbowling.com> --- drivers/net/wireless/mediatek/mt76/mac80211.c | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/drivers/net/wireless/mediatek/mt76/mac80211.c b/drivers/net/wireless/mediatek/mt76/mac80211.c index 75772979f438..d0c522909e98 100644 --- a/drivers/net/wireless/mediatek/mt76/mac80211.c +++ b/drivers/net/wireless/mediatek/mt76/mac80211.c @@ -1716,6 +1716,16 @@ void mt76_wcid_cleanup(struct mt76_dev *dev, struct mt76_wcid *wcid) idr_destroy(&wcid->pktid); + /* Remove from sta_poll_list to prevent list corruption after reset. + * Without this, mt76_reset_device() reinitializes sta_poll_list but + * leaves wcid->poll_list with stale pointers, causing list corruption + * when mt76_wcid_add_poll() checks list_empty(). 
+ */ + spin_lock_bh(&dev->sta_poll_lock); + if (!list_empty(&wcid->poll_list)) + list_del_init(&wcid->poll_list); + spin_unlock_bh(&dev->sta_poll_lock); + spin_lock_bh(&phy->tx_lock); if (!list_empty(&wcid->tx_list)) -- 2.52.0 ^ permalink raw reply related [flat|nested] 113+ messages in thread
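Editorial note: the stale-pointer problem described in patch 01/11 can be reproduced outside the kernel. The following is a minimal userspace sketch, not the driver code; `struct list_head` and the helpers here are simplified stand-ins for the kernel list API, and `entry_is_stale_after_reset()` is a hypothetical name introduced only for illustration. It assumes the reset path reinitializes the master list head without touching the entries, as the commit message describes.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified userspace stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

static void init_list_head(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

static bool list_empty(const struct list_head *h)
{
	return h->next == h;
}

static void list_add_tail(struct list_head *n, struct list_head *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/* Unlink the entry and make it point at itself again. */
static void list_del_init(struct list_head *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	init_list_head(n);
}

/*
 * Hypothetical helper: queue one entry, optionally unlink it, then
 * reinitialize the master list the way a device reset would.
 * Returns true if the entry still claims to be on a list afterwards.
 */
static bool entry_is_stale_after_reset(bool cleanup_first)
{
	struct list_head poll_list_head, entry;

	init_list_head(&poll_list_head);
	init_list_head(&entry);
	list_add_tail(&entry, &poll_list_head);

	if (cleanup_first)
		list_del_init(&entry);

	/* The reset only touches the head, never the entries. */
	init_list_head(&poll_list_head);

	return !list_empty(&entry);
}
```

Without the cleanup step the entry still points into the old list, so a later `list_empty()` check on it gives the wrong answer. This is the situation the added `list_del_init()` call is meant to avoid.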
* [PATCH 02/11] wifi: mt76: mt792x: fix NULL pointer and firmware reload issues 2026-01-20 6:28 ` [PATCH v5 00/11] wifi: mt76: mt7925/mt7921 stability fixes Zac 2026-01-20 6:28 ` [PATCH 01/11] wifi: mt76: fix list corruption in mt76_wcid_cleanup Zac @ 2026-01-20 6:28 ` Zac 2026-01-20 7:04 ` Greg KH 2026-01-20 6:28 ` [PATCH 03/11] wifi: mt76: mt7921: add mutex protection in critical paths Zac ` (8 subsequent siblings) 10 siblings, 1 reply; 113+ messages in thread
From: Zac @ 2026-01-20 6:28 UTC (permalink / raw)
To: sean.wang
Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless, lorenzo, nbd, ryder.lee, sean.wang, stable, linux, zbowling, Zac Bowling

From: Zac Bowling <zac@zacbowling.com>

This patch combines two fixes for the shared mt792x code used by both
MT7921 and MT7925 drivers:

1. Fix NULL pointer dereference in TX path:

   Add NULL pointer checks in mt792x_tx() to prevent kernel crashes when
   transmitting packets during MLO link removal.

   The function calls mt792x_sta_to_link() which can return NULL if the
   link is being removed, but the return value was dereferenced without
   checking. Similarly, the RCU-protected link_conf and link_sta pointers
   were used without NULL validation.

   This race can occur when:
   - A packet is queued for transmission
   - Concurrently, the link is being removed (mt7925_mac_link_sta_remove)
   - mt792x_sta_to_link() returns NULL for the removed link
   - Kernel crashes on wcid = &mlink->wcid dereference

   Fix by checking mlink, conf, and link_sta before use, freeing the SKB
   and returning early if any pointer is NULL.

2. Fix firmware reload failure after previous load crash:

   If the firmware loading process crashes or is interrupted after
   acquiring the patch semaphore but before releasing it, subsequent
   firmware load attempts will fail with 'Failed to get patch semaphore'.

   Apply the same fix from MT7915 (commit 79dd14f): release the patch
   semaphore before starting firmware load and restart MCU firmware to
   ensure clean state.
Fixes: c74df1c067f2 ("wifi: mt76: mt792x: introduce mt792x-lib module") Fixes: 583204ae70f9 ("wifi: mt76: mt792x: move mt7921_load_firmware in mt792x-lib module") Link: https://github.com/openwrt/mt76/commit/79dd14f2e8161b656341b6653261779199aedbe4 Signed-off-by: Zac Bowling <zac@zacbowling.com> --- .../net/wireless/mediatek/mt76/mt792x_core.c | 27 +++++++++++++++++-- 1 file changed, 25 insertions(+), 2 deletions(-) diff --git a/drivers/net/wireless/mediatek/mt76/mt792x_core.c b/drivers/net/wireless/mediatek/mt76/mt792x_core.c index f2ed16feb6c1..05598202b488 100644 --- a/drivers/net/wireless/mediatek/mt76/mt792x_core.c +++ b/drivers/net/wireless/mediatek/mt76/mt792x_core.c @@ -95,6 +95,8 @@ void mt792x_tx(struct ieee80211_hw *hw, struct ieee80211_tx_control *control, IEEE80211_TX_CTRL_MLO_LINK); sta = (struct mt792x_sta *)control->sta->drv_priv; mlink = mt792x_sta_to_link(sta, link_id); + if (!mlink) + goto free_skb; wcid = &mlink->wcid; } @@ -113,9 +115,12 @@ void mt792x_tx(struct ieee80211_hw *hw, struct ieee80211_tx_control *control, link_id = wcid->link_id; rcu_read_lock(); conf = rcu_dereference(vif->link_conf[link_id]); - memcpy(hdr->addr2, conf->addr, ETH_ALEN); - link_sta = rcu_dereference(control->sta->link[link_id]); + if (!conf || !link_sta) { + rcu_read_unlock(); + goto free_skb; + } + memcpy(hdr->addr2, conf->addr, ETH_ALEN); memcpy(hdr->addr1, link_sta->addr, ETH_ALEN); if (vif->type == NL80211_IFTYPE_STATION) @@ -136,6 +141,10 @@ void mt792x_tx(struct ieee80211_hw *hw, struct ieee80211_tx_control *control, } mt76_connac_pm_queue_skb(hw, &dev->pm, wcid, skb); + return; + +free_skb: + ieee80211_free_txskb(hw, skb); } EXPORT_SYMBOL_GPL(mt792x_tx); @@ -927,6 +936,20 @@ int mt792x_load_firmware(struct mt792x_dev *dev) { int ret; + /* Release semaphore if taken by previous failed load attempt. + * This prevents "Failed to get patch semaphore" errors when + * recovering from firmware crashes or suspend/resume failures. 
+ */ + ret = mt76_connac_mcu_patch_sem_ctrl(&dev->mt76, false); + if (ret < 0) + dev_dbg(dev->mt76.dev, "Semaphore release returned %d (may be expected)\n", ret); + + /* Always restart MCU to ensure clean state before loading firmware */ + mt76_connac_mcu_restart(&dev->mt76); + + /* Wait for MCU to be ready after restart */ + msleep(100); + ret = mt76_connac2_load_patch(&dev->mt76, mt792x_patch_name(dev)); if (ret) return ret; -- 2.52.0 ^ permalink raw reply related [flat|nested] 113+ messages in thread
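Editorial note: the TX-path fix in patch 02/11 follows the common "check every lookup, then jump to a single free label" pattern. The sketch below shows that control flow in plain userspace C under simplifying assumptions. `struct link_model`, `struct frame_model` and `tx_model()` are hypothetical names, not mt76 types, and the RCU read-side locking used by the real code is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ADDR_LEN 6

/* Hypothetical stand-in for a per-link configuration or station entry. */
struct link_model {
	unsigned char addr[ADDR_LEN];
};

/* Hypothetical stand-in for an skb: two address fields plus outcome flags. */
struct frame_model {
	unsigned char addr1[ADDR_LEN];
	unsigned char addr2[ADDR_LEN];
	int queued;
	int freed;
};

/*
 * Either pointer may be NULL if the link was torn down while the frame
 * was in flight. Check both before the first dereference and release
 * the frame on a single exit path instead of crashing.
 */
static void tx_model(struct frame_model *frame,
		     const struct link_model *conf,
		     const struct link_model *link_sta)
{
	if (!conf || !link_sta)
		goto drop;

	memcpy(frame->addr2, conf->addr, ADDR_LEN);
	memcpy(frame->addr1, link_sta->addr, ADDR_LEN);
	frame->queued = 1;	/* models handing the frame to the TX queue */
	return;

drop:
	frame->freed = 1;	/* models freeing the frame */
}
```

Placing both checks before either `memcpy()` matters here: the original code copied from `conf->addr` before `link_sta` had even been looked up, so a check added after that copy would have come too late.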
* Re: [PATCH 02/11] wifi: mt76: mt792x: fix NULL pointer and firmware reload issues 2026-01-20 6:28 ` [PATCH 02/11] wifi: mt76: mt792x: fix NULL pointer and firmware reload issues Zac @ 2026-01-20 7:04 ` Greg KH 0 siblings, 0 replies; 113+ messages in thread From: Greg KH @ 2026-01-20 7:04 UTC (permalink / raw) To: Zac Cc: sean.wang, deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless, lorenzo, nbd, ryder.lee, sean.wang, stable, linux, zbowling On Mon, Jan 19, 2026 at 10:28:45PM -0800, Zac wrote: > From: Zac Bowling <zac@zacbowling.com> > > This patch combines two fixes for the shared mt792x code used by both > MT7921 and MT7925 drivers: > > 1. Fix NULL pointer dereference in TX path: > > Add NULL pointer checks in mt792x_tx() to prevent kernel crashes when > transmitting packets during MLO link removal. > > The function calls mt792x_sta_to_link() which can return NULL if the > link is being removed, but the return value was dereferenced without > checking. Similarly, the RCU-protected link_conf and link_sta pointers > were used without NULL validation. > > This race can occur when: > - A packet is queued for transmission > - Concurrently, the link is being removed (mt7925_mac_link_sta_remove) > - mt792x_sta_to_link() returns NULL for the removed link > - Kernel crashes on wcid = &mlink->wcid dereference > > Fix by checking mlink, conf, and link_sta before use, freeing the SKB > and returning early if any pointer is NULL. > > 2. Fix firmware reload failure after previous load crash: > > If the firmware loading process crashes or is interrupted after > acquiring the patch semaphore but before releasing it, subsequent > firmware load attempts will fail with 'Failed to get patch semaphore'. > > Apply the same fix from MT7915 (commit 79dd14f): release the patch > semaphore before starting firmware load and restart MCU firmware to > ensure clean state. 
> > Fixes: c74df1c067f2 ("wifi: mt76: mt792x: introduce mt792x-lib module") > Fixes: 583204ae70f9 ("wifi: mt76: mt792x: move mt7921_load_firmware in mt792x-lib module") > Link: https://github.com/openwrt/mt76/commit/79dd14f2e8161b656341b6653261779199aedbe4 > Signed-off-by: Zac Bowling <zac@zacbowling.com> > --- > .../net/wireless/mediatek/mt76/mt792x_core.c | 27 +++++++++++++++++-- > 1 file changed, 25 insertions(+), 2 deletions(-) > > diff --git a/drivers/net/wireless/mediatek/mt76/mt792x_core.c b/drivers/net/wireless/mediatek/mt76/mt792x_core.c > index f2ed16feb6c1..05598202b488 100644 > --- a/drivers/net/wireless/mediatek/mt76/mt792x_core.c > +++ b/drivers/net/wireless/mediatek/mt76/mt792x_core.c > @@ -95,6 +95,8 @@ void mt792x_tx(struct ieee80211_hw *hw, struct ieee80211_tx_control *control, > IEEE80211_TX_CTRL_MLO_LINK); > sta = (struct mt792x_sta *)control->sta->drv_priv; > mlink = mt792x_sta_to_link(sta, link_id); > + if (!mlink) > + goto free_skb; > wcid = &mlink->wcid; > } > > @@ -113,9 +115,12 @@ void mt792x_tx(struct ieee80211_hw *hw, struct ieee80211_tx_control *control, > link_id = wcid->link_id; > rcu_read_lock(); > conf = rcu_dereference(vif->link_conf[link_id]); > - memcpy(hdr->addr2, conf->addr, ETH_ALEN); > - > link_sta = rcu_dereference(control->sta->link[link_id]); > + if (!conf || !link_sta) { > + rcu_read_unlock(); > + goto free_skb; > + } > + memcpy(hdr->addr2, conf->addr, ETH_ALEN); > memcpy(hdr->addr1, link_sta->addr, ETH_ALEN); > > if (vif->type == NL80211_IFTYPE_STATION) > @@ -136,6 +141,10 @@ void mt792x_tx(struct ieee80211_hw *hw, struct ieee80211_tx_control *control, > } > > mt76_connac_pm_queue_skb(hw, &dev->pm, wcid, skb); > + return; > + > +free_skb: > + ieee80211_free_txskb(hw, skb); > } > EXPORT_SYMBOL_GPL(mt792x_tx); > > @@ -927,6 +936,20 @@ int mt792x_load_firmware(struct mt792x_dev *dev) > { > int ret; > > + /* Release semaphore if taken by previous failed load attempt. 
> + * This prevents "Failed to get patch semaphore" errors when > + * recovering from firmware crashes or suspend/resume failures. > + */ > + ret = mt76_connac_mcu_patch_sem_ctrl(&dev->mt76, false); > + if (ret < 0) > + dev_dbg(dev->mt76.dev, "Semaphore release returned %d (may be expected)\n", ret); > + > + /* Always restart MCU to ensure clean state before loading firmware */ > + mt76_connac_mcu_restart(&dev->mt76); > + > + /* Wait for MCU to be ready after restart */ > + msleep(100); > + > ret = mt76_connac2_load_patch(&dev->mt76, mt792x_patch_name(dev)); > if (ret) > return ret; > -- > 2.52.0 > <formletter> This is not the correct way to submit patches for inclusion in the stable kernel tree. Please read: https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html for how to do this properly. </formletter> ^ permalink raw reply [flat|nested] 113+ messages in thread
* [PATCH 03/11] wifi: mt76: mt7921: add mutex protection in critical paths 2026-01-20 6:28 ` [PATCH v5 00/11] wifi: mt76: mt7925/mt7921 stability fixes Zac 2026-01-20 6:28 ` [PATCH 01/11] wifi: mt76: fix list corruption in mt76_wcid_cleanup Zac 2026-01-20 6:28 ` [PATCH 02/11] wifi: mt76: mt792x: fix NULL pointer and firmware reload issues Zac @ 2026-01-20 6:28 ` Zac 2026-01-20 6:28 ` [PATCH 04/11] wifi: mt76: mt7921: fix deadlock in sta removal and suspend ROC abort Zac ` (7 subsequent siblings) 10 siblings, 0 replies; 113+ messages in thread
From: Zac @ 2026-01-20 6:28 UTC (permalink / raw)
To: sean.wang
Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless, lorenzo, nbd, ryder.lee, sean.wang, stable, linux, zbowling, Zac Bowling

From: Zac Bowling <zac@zacbowling.com>

Add mutex protection for mt7921 driver operations that access hardware
state without proper synchronization. This fixes multiple race conditions
that can cause system instability.

Changes, as reflected in the diff below:

1. mac.c: mt7921_mac_reset_work()
   - Wrap ieee80211_iterate_active_interfaces() with
     mt792x_mutex_acquire()/mt792x_mutex_release()
   - The vif_connect_iter callback accesses hw_encap state

2. main.c: mt7921_roc_abort_sync()
   - Document that the caller must hold the mutex if
     ieee80211_iterate_interfaces() is needed for ROC cleanup

3. main.c: mt7921_set_runtime_pm()
   - Document that the caller (debugfs) must hold the mutex

4. main.c: mt7921_regd_set_6ghz_power_type()
   - Wrap ieee80211_iterate_active_interfaces() with
     mt792x_mutex_acquire()/mt792x_mutex_release()

5. pci.c: mt7921_pci_suspend()
   - Acquire the mutex around mt7921_roc_abort_sync()

6. sdio.c: mt7921s_suspend()
   - Same change as pci.c for the SDIO suspend path

These changes add the missing mutex protection in the mt7921 driver.
Fixes: 5c14a5f944b9 ("wifi: mt76: mt7921: introduce remain_on_channel support") Signed-off-by: Zac Bowling <zac@zacbowling.com> --- drivers/net/wireless/mediatek/mt76/mt7921/mac.c | 2 ++ drivers/net/wireless/mediatek/mt76/mt7921/main.c | 8 ++++++++ drivers/net/wireless/mediatek/mt76/mt7921/pci.c | 2 ++ drivers/net/wireless/mediatek/mt76/mt7921/sdio.c | 2 ++ 4 files changed, 14 insertions(+) diff --git a/drivers/net/wireless/mediatek/mt76/mt7921/mac.c b/drivers/net/wireless/mediatek/mt76/mt7921/mac.c index 03b4960db73f..f5c882e45bbe 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7921/mac.c +++ b/drivers/net/wireless/mediatek/mt76/mt7921/mac.c @@ -693,9 +693,11 @@ void mt7921_mac_reset_work(struct work_struct *work) clear_bit(MT76_RESET, &dev->mphy.state); pm->suspended = false; ieee80211_wake_queues(hw); + mt792x_mutex_acquire(dev); ieee80211_iterate_active_interfaces(hw, IEEE80211_IFACE_ITER_RESUME_ALL, mt7921_vif_connect_iter, NULL); + mt792x_mutex_release(dev); mt76_connac_power_save_sched(&dev->mt76.phy, pm); } diff --git a/drivers/net/wireless/mediatek/mt76/mt7921/main.c b/drivers/net/wireless/mediatek/mt76/mt7921/main.c index 5fae9a6e273c..9315dbdf8880 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7921/main.c +++ b/drivers/net/wireless/mediatek/mt76/mt7921/main.c @@ -373,6 +373,11 @@ void mt7921_roc_abort_sync(struct mt792x_dev *dev) timer_delete_sync(&phy->roc_timer); cancel_work_sync(&phy->roc_work); + /* Note: caller must hold mutex if ieee80211_iterate_interfaces is + * needed for ROC cleanup. Some call sites (like mt7921_mac_sta_remove) + * already hold the mutex via mt76_sta_remove(). For suspend paths, + * the mutex should be acquired before calling this function. 
+ */ if (test_and_clear_bit(MT76_STATE_ROC, &phy->mt76->state)) ieee80211_iterate_interfaces(mt76_hw(dev), IEEE80211_IFACE_ITER_RESUME_ALL, @@ -619,6 +624,7 @@ void mt7921_set_runtime_pm(struct mt792x_dev *dev) bool monitor = !!(hw->conf.flags & IEEE80211_CONF_MONITOR); pm->enable = pm->enable_user && !monitor; + /* Note: caller (debugfs) must hold mutex before calling this function */ ieee80211_iterate_active_interfaces(hw, IEEE80211_IFACE_ITER_RESUME_ALL, mt7921_pm_interface_iter, dev); @@ -765,9 +771,11 @@ mt7921_regd_set_6ghz_power_type(struct ieee80211_vif *vif, bool is_add) struct mt792x_dev *dev = phy->dev; u32 valid_vif_num = 0; + mt792x_mutex_acquire(dev); ieee80211_iterate_active_interfaces(mt76_hw(dev), IEEE80211_IFACE_ITER_RESUME_ALL, mt7921_calc_vif_num, &valid_vif_num); + mt792x_mutex_release(dev); if (valid_vif_num > 1) { phy->power_type = MT_AP_DEFAULT; diff --git a/drivers/net/wireless/mediatek/mt76/mt7921/pci.c b/drivers/net/wireless/mediatek/mt76/mt7921/pci.c index ec9686183251..9f76b334b93d 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7921/pci.c +++ b/drivers/net/wireless/mediatek/mt76/mt7921/pci.c @@ -426,7 +426,9 @@ static int mt7921_pci_suspend(struct device *device) cancel_delayed_work_sync(&pm->ps_work); cancel_work_sync(&pm->wake_work); + mt792x_mutex_acquire(dev); mt7921_roc_abort_sync(dev); + mt792x_mutex_release(dev); err = mt792x_mcu_drv_pmctrl(dev); if (err < 0) diff --git a/drivers/net/wireless/mediatek/mt76/mt7921/sdio.c b/drivers/net/wireless/mediatek/mt76/mt7921/sdio.c index 3421e53dc948..92ea2811816f 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7921/sdio.c +++ b/drivers/net/wireless/mediatek/mt76/mt7921/sdio.c @@ -219,7 +219,9 @@ static int mt7921s_suspend(struct device *__dev) cancel_delayed_work_sync(&pm->ps_work); cancel_work_sync(&pm->wake_work); + mt792x_mutex_acquire(dev); mt7921_roc_abort_sync(dev); + mt792x_mutex_release(dev); err = mt792x_mcu_drv_pmctrl(dev); if (err < 0) -- 2.52.0 ^ permalink raw reply related 
[flat|nested] 113+ messages in thread
* [PATCH 04/11] wifi: mt76: mt7921: fix deadlock in sta removal and suspend ROC abort 2026-01-20 6:28 ` [PATCH v5 00/11] wifi: mt76: mt7925/mt7921 stability fixes Zac ` (2 preceding siblings ...) 2026-01-20 6:28 ` [PATCH 03/11] wifi: mt76: mt7921: add mutex protection in critical paths Zac @ 2026-01-20 6:28 ` Zac 2026-01-20 6:28 ` [PATCH 05/11] wifi: mt76: mt7925: add comprehensive NULL pointer protection for MLO Zac ` (6 subsequent siblings) 10 siblings, 0 replies; 113+ messages in thread
From: Zac @ 2026-01-20 6:28 UTC (permalink / raw)
To: sean.wang
Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless, lorenzo, nbd, ryder.lee, sean.wang, stable, linux, zbowling, Zac Bowling

From: Zac Bowling <zac@zacbowling.com>

Fix deadlock scenarios in mt7921 ROC (Remain On Channel) abort paths:

1. Suspend path deadlock (pci.c, sdio.c):
   - Previous fix (b74d48c46f) added mutex around mt7921_roc_abort_sync
   - But roc_work acquires mutex, so cancel_work_sync can deadlock
   - Fix: Remove mutex wrappers since mt7921_roc_abort_sync doesn't
     actually need them (it only calls timer_delete_sync,
     cancel_work_sync, and ieee80211_iterate_interfaces which handles
     its own locking)

2. sta_remove path deadlock:
   - mt7921_mac_sta_remove is called from mt76_sta_remove which holds
     mutex
   - Calling mt7921_roc_abort_sync → cancel_work_sync can deadlock if
     roc_work is waiting for the mutex
   - Fix: Add mt7921_roc_abort_async (matching mt7925 pattern) that sets
     abort flag and schedules work instead of blocking
   - Add abort flag checking in mt7921_roc_work to handle async abort

The fix mirrors the mt7925 implementation which already handles these
scenarios correctly.
Fixes: b74d48c46f ("wifi: mt76: mt7921: fix mutex handling in multiple paths") Signed-off-by: Zac Bowling <zac@zacbowling.com> --- .../net/wireless/mediatek/mt76/mt7921/main.c | 29 +++++++++++++++---- .../net/wireless/mediatek/mt76/mt7921/pci.c | 2 -- .../net/wireless/mediatek/mt76/mt7921/sdio.c | 2 -- 3 files changed, 23 insertions(+), 10 deletions(-) diff --git a/drivers/net/wireless/mediatek/mt76/mt7921/main.c b/drivers/net/wireless/mediatek/mt76/mt7921/main.c index 9315dbdf8880..07d1d0d497f1 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7921/main.c +++ b/drivers/net/wireless/mediatek/mt76/mt7921/main.c @@ -367,17 +367,24 @@ static void mt7921_roc_iter(void *priv, u8 *mac, mt7921_mcu_abort_roc(phy, mvif, phy->roc_token_id); } +/* Async ROC abort - safe to call while holding mutex. + * Sets abort flag and schedules roc_work for cleanup. + */ +static void mt7921_roc_abort_async(struct mt792x_dev *dev) +{ + struct mt792x_phy *phy = &dev->phy; + + set_bit(MT76_STATE_ROC_ABORT, &phy->mt76->state); + timer_delete(&phy->roc_timer); + ieee80211_queue_work(phy->mt76->hw, &phy->roc_work); +} + void mt7921_roc_abort_sync(struct mt792x_dev *dev) { struct mt792x_phy *phy = &dev->phy; timer_delete_sync(&phy->roc_timer); cancel_work_sync(&phy->roc_work); - /* Note: caller must hold mutex if ieee80211_iterate_interfaces is - * needed for ROC cleanup. Some call sites (like mt7921_mac_sta_remove) - * already hold the mutex via mt76_sta_remove(). For suspend paths, - * the mutex should be acquired before calling this function. - */ if (test_and_clear_bit(MT76_STATE_ROC, &phy->mt76->state)) ieee80211_iterate_interfaces(mt76_hw(dev), IEEE80211_IFACE_ITER_RESUME_ALL, @@ -392,6 +399,15 @@ void mt7921_roc_work(struct work_struct *work) phy = (struct mt792x_phy *)container_of(work, struct mt792x_phy, roc_work); + /* Check abort flag before acquiring mutex to prevent deadlock. + * Only send expired callback if ROC was actually active. 
+ */ + if (test_and_clear_bit(MT76_STATE_ROC_ABORT, &phy->mt76->state)) { + if (test_and_clear_bit(MT76_STATE_ROC, &phy->mt76->state)) + ieee80211_remain_on_channel_expired(phy->mt76->hw); + return; + } + if (!test_and_clear_bit(MT76_STATE_ROC, &phy->mt76->state)) return; @@ -887,7 +903,8 @@ void mt7921_mac_sta_remove(struct mt76_dev *mdev, struct ieee80211_vif *vif, struct mt792x_dev *dev = container_of(mdev, struct mt792x_dev, mt76); struct mt792x_sta *msta = (struct mt792x_sta *)sta->drv_priv; - mt7921_roc_abort_sync(dev); + /* Async abort - caller already holds mutex */ + mt7921_roc_abort_async(dev); mt76_connac_free_pending_tx_skbs(&dev->pm, &msta->deflink.wcid); mt76_connac_pm_wake(&dev->mphy, &dev->pm); diff --git a/drivers/net/wireless/mediatek/mt76/mt7921/pci.c b/drivers/net/wireless/mediatek/mt76/mt7921/pci.c index 9f76b334b93d..ec9686183251 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7921/pci.c +++ b/drivers/net/wireless/mediatek/mt76/mt7921/pci.c @@ -426,9 +426,7 @@ static int mt7921_pci_suspend(struct device *device) cancel_delayed_work_sync(&pm->ps_work); cancel_work_sync(&pm->wake_work); - mt792x_mutex_acquire(dev); mt7921_roc_abort_sync(dev); - mt792x_mutex_release(dev); err = mt792x_mcu_drv_pmctrl(dev); if (err < 0) diff --git a/drivers/net/wireless/mediatek/mt76/mt7921/sdio.c b/drivers/net/wireless/mediatek/mt76/mt7921/sdio.c index 92ea2811816f..3421e53dc948 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7921/sdio.c +++ b/drivers/net/wireless/mediatek/mt76/mt7921/sdio.c @@ -219,9 +219,7 @@ static int mt7921s_suspend(struct device *__dev) cancel_delayed_work_sync(&pm->ps_work); cancel_work_sync(&pm->wake_work); - mt792x_mutex_acquire(dev); mt7921_roc_abort_sync(dev); - mt792x_mutex_release(dev); err = mt792x_mcu_drv_pmctrl(dev); if (err < 0) -- 2.52.0 ^ permalink raw reply related [flat|nested] 113+ messages in thread
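Editorial note: the asynchronous abort in patch 04/11 replaces a blocking `cancel_work_sync()` with a flag that the work handler consumes later. The single-threaded userspace model below sketches that state machine. `struct phy_model` and its helpers are hypothetical, C11 atomics stand in for the kernel's bit operations, and `expired_calls` counts what would be `ieee80211_remain_on_channel_expired()` calls in the driver. The normal expiry path is reduced to one line here, whereas the real handler takes the mutex and aborts ROC in firmware first.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

enum {
	STATE_ROC       = 1u << 0,	/* ROC currently active */
	STATE_ROC_ABORT = 1u << 1,	/* async abort requested */
};

/* Hypothetical model of the per-phy ROC state. */
struct phy_model {
	atomic_uint state;
	int expired_calls;	/* number of "ROC expired" notifications */
	int work_queued;
};

static void phy_model_init(struct phy_model *phy)
{
	atomic_init(&phy->state, 0);
	phy->expired_calls = 0;
	phy->work_queued = 0;
}

/* Userspace analogue of test_and_clear_bit(). */
static bool test_and_clear(atomic_uint *state, unsigned int bit)
{
	return (atomic_fetch_and(state, ~bit) & bit) != 0;
}

/* Safe to call with the driver mutex held: never waits for the work. */
static void roc_abort_async(struct phy_model *phy)
{
	atomic_fetch_or(&phy->state, STATE_ROC_ABORT);
	phy->work_queued = 1;
}

static void roc_work(struct phy_model *phy)
{
	phy->work_queued = 0;

	/* Abort requested: notify only if ROC was really active. */
	if (test_and_clear(&phy->state, STATE_ROC_ABORT)) {
		if (test_and_clear(&phy->state, STATE_ROC))
			phy->expired_calls++;
		return;
	}

	if (!test_and_clear(&phy->state, STATE_ROC))
		return;

	/* Normal expiry path (simplified). */
	phy->expired_calls++;
}
```

The nested check in the abort branch corresponds to the "spurious callback" item in the cover letter: an abort that arrives when no ROC is active clears the flag and reports nothing.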
* [PATCH 05/11] wifi: mt76: mt7925: add comprehensive NULL pointer protection for MLO 2026-01-20 6:28 ` [PATCH v5 00/11] wifi: mt76: mt7925/mt7921 stability fixes Zac ` (3 preceding siblings ...) 2026-01-20 6:28 ` [PATCH 04/11] wifi: mt76: mt7921: fix deadlock in sta removal and suspend ROC abort Zac @ 2026-01-20 6:28 ` Zac 2026-01-20 6:28 ` [PATCH 06/11] wifi: mt76: mt7925: add mutex protection in critical paths Zac ` (5 subsequent siblings) 10 siblings, 0 replies; 113+ messages in thread From: Zac @ 2026-01-20 6:28 UTC (permalink / raw) To: sean.wang Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless, lorenzo, nbd, ryder.lee, sean.wang, stable, linux, zbowling, Zac Bowling From: Zac Bowling <zac@zacbowling.com> Add NULL pointer checks for functions that return pointers to link-related structures throughout the mt7925 driver. During MLO state transitions, these functions can return NULL when link configuration is not synchronized. Functions protected: - mt792x_vif_to_bss_conf(): Returns link BSS configuration - mt792x_vif_to_link(): Returns driver link state - mt792x_sta_to_link(): Returns station link state Files updated: 1. mac.c: - mt7925_vif_connect_iter(): Check bss_conf before use - mt7925_mac_sta_assoc(): Check bss_conf before use 2. main.c: - mt7925_set_key(): Check link_conf and mlink - mt7925_mac_link_sta_add(): Check link_conf and mlink - mt7925_mac_link_sta_assoc(): Check bss_conf and mlink - mt7925_mac_link_sta_remove(): Check bss_conf and mlink - mt7925_change_vif_links(): Check conf before use - mt7925_assign_vif_chanctx(): Check mconf and mlink - mt7925_unassign_vif_chanctx(): Check mconf and mlink - mt7925_mgd_prepare_tx(): Check link_conf 3. 
mcu.c: - mt7925_mcu_sta_phy_tlv(): Check link_sta - mt7925_mcu_sta_amsdu_tlv(): Check link_sta - mt7925_mcu_sta_mld_tlv(): Check link_sta - mt7925_mcu_sta_cmd(): Check mlink - mt7925_mcu_add_bss_info(): Check link_conf - mt7925_mcu_set_chctx(): Check link_conf and mlink Prevents crashes during: - BSSID roaming transitions - MLO setup and teardown - Hardware reset operations - Runtime power management Fixes: c948b5da6bbe ("wifi: mt76: mt7925: add Mediatek Wi-Fi7 driver for mt7925 device") Signed-off-by: Zac Bowling <zac@zacbowling.com> --- .../net/wireless/mediatek/mt76/mt7925/mac.c | 6 ++ .../net/wireless/mediatek/mt76/mt7925/main.c | 82 ++++++++++++++++--- .../net/wireless/mediatek/mt76/mt7925/mcu.c | 22 ++++- 3 files changed, 97 insertions(+), 13 deletions(-) diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/mac.c b/drivers/net/wireless/mediatek/mt76/mt7925/mac.c index 871b67101976..184efe8afa10 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7925/mac.c +++ b/drivers/net/wireless/mediatek/mt76/mt7925/mac.c @@ -1271,6 +1271,12 @@ mt7925_vif_connect_iter(void *priv, u8 *mac, bss_conf = mt792x_vif_to_bss_conf(vif, i); mconf = mt792x_vif_to_link(mvif, i); + /* Skip links that don't have bss_conf set up yet in mac80211. + * This can happen during HW reset when link state is inconsistent. + */ + if (!bss_conf) + continue; + mt76_connac_mcu_uni_add_dev(&dev->mphy, bss_conf, &mconf->mt76, &mvif->sta.deflink.wcid, true); mt7925_mcu_set_tx(dev, bss_conf); diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/main.c b/drivers/net/wireless/mediatek/mt76/mt7925/main.c index 2d358a96640c..15d1b1b8d9f8 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7925/main.c +++ b/drivers/net/wireless/mediatek/mt76/mt7925/main.c @@ -604,6 +604,10 @@ static int mt7925_set_link_key(struct ieee80211_hw *hw, enum set_key_cmd cmd, link_sta = sta ? 
mt792x_sta_to_link_sta(vif, sta, link_id) : NULL; mconf = mt792x_vif_to_link(mvif, link_id); mlink = mt792x_sta_to_link(msta, link_id); + + if (!link_conf || !mconf || !mlink) + return -EINVAL; + wcid = &mlink->wcid; wcid_keyidx = &wcid->hw_key_idx; @@ -856,12 +860,17 @@ static int mt7925_mac_link_sta_add(struct mt76_dev *mdev, msta = (struct mt792x_sta *)link_sta->sta->drv_priv; mlink = mt792x_sta_to_link(msta, link_id); + if (!mlink) + return -EINVAL; idx = mt76_wcid_alloc(dev->mt76.wcid_mask, MT792x_WTBL_STA - 1); if (idx < 0) return -ENOSPC; mconf = mt792x_vif_to_link(mvif, link_id); + if (!mconf) + return -EINVAL; + mt76_wcid_init(&mlink->wcid, 0); mlink->wcid.sta = 1; mlink->wcid.idx = idx; @@ -887,6 +896,8 @@ static int mt7925_mac_link_sta_add(struct mt76_dev *mdev, MT_WTBL_UPDATE_ADM_COUNT_CLEAR); link_conf = mt792x_vif_to_bss_conf(vif, link_id); + if (!link_conf) + return -EINVAL; /* should update bss info before STA add */ if (vif->type == NL80211_IFTYPE_STATION && !link_sta->sta->tdls) { @@ -993,18 +1004,29 @@ mt7925_mac_set_links(struct mt76_dev *mdev, struct ieee80211_vif *vif) { struct mt792x_dev *dev = container_of(mdev, struct mt792x_dev, mt76); struct mt792x_vif *mvif = (struct mt792x_vif *)vif->drv_priv; - struct ieee80211_bss_conf *link_conf = - mt792x_vif_to_bss_conf(vif, mvif->deflink_id); - struct cfg80211_chan_def *chandef = &link_conf->chanreq.oper; - enum nl80211_band band = chandef->chan->band, secondary_band; + struct ieee80211_bss_conf *link_conf; + struct cfg80211_chan_def *chandef; + enum nl80211_band band, secondary_band; + u16 sel_links; + u8 secondary_link_id; - u16 sel_links = mt76_select_links(vif, 2); - u8 secondary_link_id = __ffs(~BIT(mvif->deflink_id) & sel_links); + link_conf = mt792x_vif_to_bss_conf(vif, mvif->deflink_id); + if (!link_conf) + return; + + chandef = &link_conf->chanreq.oper; + band = chandef->chan->band; + + sel_links = mt76_select_links(vif, 2); + secondary_link_id = __ffs(~BIT(mvif->deflink_id) & sel_links); 
if (!ieee80211_vif_is_mld(vif) || hweight16(sel_links) < 2) return; link_conf = mt792x_vif_to_bss_conf(vif, secondary_link_id); + if (!link_conf) + return; + secondary_band = link_conf->chanreq.oper.chan->band; if (band == NL80211_BAND_2GHZ || @@ -1032,6 +1054,8 @@ static void mt7925_mac_link_sta_assoc(struct mt76_dev *mdev, msta = (struct mt792x_sta *)link_sta->sta->drv_priv; mlink = mt792x_sta_to_link(msta, link_sta->link_id); + if (!mlink) + return; mt792x_mutex_acquire(dev); @@ -1041,12 +1065,13 @@ static void mt7925_mac_link_sta_assoc(struct mt76_dev *mdev, link_conf = mt792x_vif_to_bss_conf(vif, vif->bss_conf.link_id); } - if (vif->type == NL80211_IFTYPE_STATION && !link_sta->sta->tdls) { + if (link_conf && vif->type == NL80211_IFTYPE_STATION && !link_sta->sta->tdls) { struct mt792x_bss_conf *mconf; mconf = mt792x_link_conf_to_mconf(link_conf); - mt7925_mcu_add_bss_info(&dev->phy, mconf->mt76.ctx, - link_conf, link_sta, true); + if (mconf) + mt7925_mcu_add_bss_info(&dev->phy, mconf->mt76.ctx, + link_conf, link_sta, true); } ewma_avg_signal_init(&mlink->avg_ack_signal); @@ -1093,6 +1118,8 @@ static void mt7925_mac_link_sta_remove(struct mt76_dev *mdev, msta = (struct mt792x_sta *)link_sta->sta->drv_priv; mlink = mt792x_sta_to_link(msta, link_id); + if (!mlink) + return; mt7925_roc_abort_sync(dev); @@ -1106,10 +1133,12 @@ static void mt7925_mac_link_sta_remove(struct mt76_dev *mdev, link_conf = mt792x_vif_to_bss_conf(vif, link_id); - if (vif->type == NL80211_IFTYPE_STATION && !link_sta->sta->tdls) { + if (link_conf && vif->type == NL80211_IFTYPE_STATION && !link_sta->sta->tdls) { struct mt792x_bss_conf *mconf; mconf = mt792x_link_conf_to_mconf(link_conf); + if (!mconf) + goto out; if (ieee80211_vif_is_mld(vif)) mt792x_mac_link_bss_remove(dev, mconf, mlink); @@ -1117,6 +1146,7 @@ static void mt7925_mac_link_sta_remove(struct mt76_dev *mdev, mt7925_mcu_add_bss_info(&dev->phy, mconf->mt76.ctx, link_conf, link_sta, false); } +out: 
spin_lock_bh(&mdev->sta_poll_lock); if (!list_empty(&mlink->wcid.poll_list)) @@ -1304,6 +1334,8 @@ mt7925_mlo_pm_iter(void *priv, u8 *mac, struct ieee80211_vif *vif) mt792x_mutex_acquire(dev); for_each_set_bit(i, &valid, IEEE80211_MLD_MAX_NUM_LINKS) { bss_conf = mt792x_vif_to_bss_conf(vif, i); + if (!bss_conf) + continue; mt7925_mcu_uni_bss_ps(dev, bss_conf); } mt792x_mutex_release(dev); @@ -1630,6 +1662,8 @@ static void mt7925_ipv6_addr_change(struct ieee80211_hw *hw, for_each_set_bit(i, &valid, IEEE80211_MLD_MAX_NUM_LINKS) { bss_conf = mt792x_vif_to_bss_conf(vif, i); + if (!bss_conf) + continue; __mt7925_ipv6_addr_change(hw, bss_conf, idev); } } @@ -1691,6 +1725,9 @@ mt7925_conf_tx(struct ieee80211_hw *hw, struct ieee80211_vif *vif, [IEEE80211_AC_BK] = 1, }; + if (!mconf) + return -EINVAL; + /* firmware uses access class index */ mconf->queue_params[mq_to_aci[queue]] = *params; @@ -1861,6 +1898,8 @@ static void mt7925_vif_cfg_changed(struct ieee80211_hw *hw, if (changed & BSS_CHANGED_ARP_FILTER) { for_each_set_bit(i, &valid, IEEE80211_MLD_MAX_NUM_LINKS) { bss_conf = mt792x_vif_to_bss_conf(vif, i); + if (!bss_conf) + continue; mt7925_mcu_update_arp_filter(&dev->mt76, bss_conf); } } @@ -1876,6 +1915,8 @@ static void mt7925_vif_cfg_changed(struct ieee80211_hw *hw, } else if (mvif->mlo_pm_state == MT792x_MLO_CHANGED_PS) { for_each_set_bit(i, &valid, IEEE80211_MLD_MAX_NUM_LINKS) { bss_conf = mt792x_vif_to_bss_conf(vif, i); + if (!bss_conf) + continue; mt7925_mcu_uni_bss_ps(dev, bss_conf); } } @@ -1897,7 +1938,12 @@ static void mt7925_link_info_changed(struct ieee80211_hw *hw, struct ieee80211_bss_conf *link_conf; mconf = mt792x_vif_to_link(mvif, info->link_id); + if (!mconf) + return; + link_conf = mt792x_vif_to_bss_conf(vif, mconf->link_id); + if (!link_conf) + return; mt792x_mutex_acquire(dev); @@ -2021,6 +2067,11 @@ mt7925_change_vif_links(struct ieee80211_hw *hw, struct ieee80211_vif *vif, mlink = mlinks[link_id]; link_conf = mt792x_vif_to_bss_conf(vif, link_id); 
+ if (!link_conf) { + err = -EINVAL; + goto free; + } + rcu_assign_pointer(mvif->link_conf[link_id], mconf); rcu_assign_pointer(mvif->sta.link[link_id], mlink); @@ -2101,9 +2152,14 @@ static int mt7925_assign_vif_chanctx(struct ieee80211_hw *hw, if (ieee80211_vif_is_mld(vif)) { mconf = mt792x_vif_to_link(mvif, link_conf->link_id); + if (!mconf) { + mutex_unlock(&dev->mt76.mutex); + return -EINVAL; + } + pri_link_conf = mt792x_vif_to_bss_conf(vif, mvif->deflink_id); - if (vif->type == NL80211_IFTYPE_STATION && + if (pri_link_conf && vif->type == NL80211_IFTYPE_STATION && mconf == &mvif->bss_conf) mt7925_mcu_add_bss_info(&dev->phy, NULL, pri_link_conf, NULL, true); @@ -2132,6 +2188,10 @@ static void mt7925_unassign_vif_chanctx(struct ieee80211_hw *hw, if (ieee80211_vif_is_mld(vif)) { mconf = mt792x_vif_to_link(mvif, link_conf->link_id); + if (!mconf) { + mutex_unlock(&dev->mt76.mutex); + return; + } if (vif->type == NL80211_IFTYPE_STATION && mconf == &mvif->bss_conf) diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/mcu.c b/drivers/net/wireless/mediatek/mt76/mt7925/mcu.c index cf0fdea45cf7..94ec62a4538a 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7925/mcu.c +++ b/drivers/net/wireless/mediatek/mt76/mt7925/mcu.c @@ -1087,6 +1087,8 @@ mt7925_mcu_sta_hdr_trans_tlv(struct sk_buff *skb, struct mt792x_link_sta *mlink; mlink = mt792x_sta_to_link(msta, link_sta->link_id); + if (!mlink) + return; wcid = &mlink->wcid; } else { wcid = &mvif->sta.deflink.wcid; @@ -1120,6 +1122,9 @@ int mt7925_mcu_wtbl_update_hdr_trans(struct mt792x_dev *dev, link_sta = mt792x_sta_to_link_sta(vif, sta, link_id); mconf = mt792x_vif_to_link(mvif, link_id); + if (!mlink || !mconf) + return -EINVAL; + skb = __mt76_connac_mcu_alloc_sta_req(&dev->mt76, &mconf->mt76, &mlink->wcid, MT7925_STA_UPDATE_MAX_SIZE); @@ -1741,6 +1746,8 @@ mt7925_mcu_sta_amsdu_tlv(struct sk_buff *skb, amsdu->amsdu_en = true; mlink = mt792x_sta_to_link(msta, link_sta->link_id); + if (!mlink) + return; mlink->wcid.amsdu 
= true; switch (link_sta->agg.max_amsdu_len) { @@ -1773,6 +1780,10 @@ mt7925_mcu_sta_phy_tlv(struct sk_buff *skb, link_conf = mt792x_vif_to_bss_conf(vif, link_sta->link_id); mconf = mt792x_vif_to_link(mvif, link_sta->link_id); + + if (!link_conf || !mconf) + return; + chandef = mconf->mt76.ctx ? &mconf->mt76.ctx->def : &link_conf->chanreq.oper; @@ -1851,6 +1862,10 @@ mt7925_mcu_sta_rate_ctrl_tlv(struct sk_buff *skb, link_conf = mt792x_vif_to_bss_conf(vif, link_sta->link_id); mconf = mt792x_vif_to_link(mvif, link_sta->link_id); + + if (!link_conf || !mconf) + return; + chandef = mconf->mt76.ctx ? &mconf->mt76.ctx->def : &link_conf->chanreq.oper; band = chandef->chan->band; @@ -1935,6 +1950,9 @@ mt7925_mcu_sta_mld_tlv(struct sk_buff *skb, mconf = mt792x_vif_to_link(mvif, i); mlink = mt792x_sta_to_link(msta, i); + if (!mconf || !mlink) + continue; + mld->link[cnt].wlan_id = cpu_to_le16(mlink->wcid.idx); mld->link[cnt++].bss_idx = mconf->mt76.idx; @@ -2027,13 +2045,13 @@ int mt7925_mcu_sta_update(struct mt792x_dev *dev, .rcpi = to_rcpi(rssi), }; struct mt792x_sta *msta; - struct mt792x_link_sta *mlink; + struct mt792x_link_sta *mlink = NULL; if (link_sta) { msta = (struct mt792x_sta *)link_sta->sta->drv_priv; mlink = mt792x_sta_to_link(msta, link_sta->link_id); } - info.wcid = link_sta ? &mlink->wcid : &mvif->sta.deflink.wcid; + info.wcid = (link_sta && mlink) ? &mlink->wcid : &mvif->sta.deflink.wcid; info.newly = state != MT76_STA_INFO_STATE_ASSOC; return mt7925_mcu_sta_cmd(&dev->mphy, &info); -- 2.52.0 ^ permalink raw reply related [flat|nested] 113+ messages in thread
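The NULL checks added to the valid_links loops above all follow one pattern: a set bit in the bitmap is not proof that the per-link lookup returns a pointer. A minimal userspace sketch of that pattern follows. Every toy_* name and TOY_MAX_LINKS are hypothetical stand-ins invented for illustration; none of this is mt76 or mac80211 code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_LINKS 15 /* illustrative size, not taken from the patch */

struct toy_link_conf {
	int configured; /* set once the pretend MCU command ran for this link */
};

struct toy_vif {
	uint16_t valid_links;                      /* driver-side bitmap */
	struct toy_link_conf *link[TOY_MAX_LINKS]; /* stack-side pointers */
};

/* Lookup that may return NULL even for a bit set in valid_links. */
static struct toy_link_conf *toy_vif_to_link_conf(struct toy_vif *vif,
						  unsigned int link_id)
{
	return link_id < TOY_MAX_LINKS ? vif->link[link_id] : NULL;
}

/* Visit every valid link and skip the ones that are not populated yet.
 * Returns the number of links that were actually configured.
 */
static int toy_configure_links(struct toy_vif *vif)
{
	int done = 0;

	for (unsigned int i = 0; i < TOY_MAX_LINKS; i++) {
		struct toy_link_conf *conf;

		if (!(vif->valid_links & (1u << i)))
			continue;

		conf = toy_vif_to_link_conf(vif, i);
		if (!conf)
			continue; /* bit set, pointer missing: skip, do not crash */

		conf->configured = 1;
		done++;
	}

	return done;
}

/* Links 0 and 1 are marked valid, but only link 0 has a configuration. */
static int toy_demo_partial_setup(void)
{
	struct toy_link_conf link0 = { 0 };
	struct toy_vif vif = { .valid_links = 0x3 };

	vif.link[0] = &link0;
	return toy_configure_links(&vif) == 1 && link0.configured == 1;
}
```

In the sketch, toy_demo_partial_setup() stands in for the reset-time window described in the commit message: the bitmap is ahead of the pointer table, and the loop still completes.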
* [PATCH 06/11] wifi: mt76: mt7925: add mutex protection in critical paths
  2026-01-20 6:28 ` [PATCH v5 00/11] wifi: mt76: mt7925/mt7921 stability fixes Zac
                  ` (4 preceding siblings ...)
  2026-01-20 6:28 ` [PATCH 05/11] wifi: mt76: mt7925: add comprehensive NULL pointer protection for MLO Zac
@ 2026-01-20 6:28 ` Zac
  2026-01-20 6:28 ` [PATCH 07/11] wifi: mt76: mt7925: add MCU command error handling Zac
                  ` (4 subsequent siblings)
  10 siblings, 0 replies; 113+ messages in thread
From: Zac @ 2026-01-20 6:28 UTC (permalink / raw)
  To: sean.wang
  Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless,
	lorenzo, nbd, ryder.lee, sean.wang, stable, linux, zbowling,
	Zac Bowling

From: Zac Bowling <zac@zacbowling.com>

Add proper mutex protection for mt7925 driver operations that access
hardware state without proper synchronization. This fixes race
conditions that can cause system instability during power management
and recovery.

Fixes added:

1. mac.c: mt7925_mac_reset_work()
   - Wrap ieee80211_iterate_active_interfaces() with mt792x_mutex
   - The vif_connect_iter callback accesses hardware state

2. mac.c: mt7925_mac_sta_assoc()
   - Wrap vif_connect_iter call with mutex protection
   - Called during station association which races with PM

3. main.c: mt7925_set_runtime_pm()
   - Add mutex protection around mt76_connac_pm_wake/sleep
   - Runtime PM can race with other operations

4. main.c: mt7925_set_mlo_pm()
   - Add mutex protection around MLO PM configuration
   - Prevents races during MLO link setup/teardown

5. pci.c: mt7925_pci_resume()
   - Add mutex protection around ieee80211_iterate_active_interfaces
   - The vif iteration accesses hardware state that needs
     synchronization

These protections ensure consistent hardware state access during power
management transitions and recovery operations.

Fixes: c948b5da6bbe ("wifi: mt76: mt7925: add Mediatek Wi-Fi7 driver for mt7925 device")
Signed-off-by: Zac Bowling <zac@zacbowling.com>
---
 drivers/net/wireless/mediatek/mt76/mt7925/mac.c  | 2 ++
 drivers/net/wireless/mediatek/mt76/mt7925/main.c | 6 ++++--
 drivers/net/wireless/mediatek/mt76/mt7925/pci.c  | 4 ++++
 3 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/mac.c b/drivers/net/wireless/mediatek/mt76/mt7925/mac.c
index 184efe8afa10..06420ac6ed55 100644
--- a/drivers/net/wireless/mediatek/mt76/mt7925/mac.c
+++ b/drivers/net/wireless/mediatek/mt76/mt7925/mac.c
@@ -1331,9 +1331,11 @@ void mt7925_mac_reset_work(struct work_struct *work)
 	dev->hw_full_reset = false;
 	pm->suspended = false;
 	ieee80211_wake_queues(hw);
+	mt792x_mutex_acquire(dev);
 	ieee80211_iterate_active_interfaces(hw,
 					    IEEE80211_IFACE_ITER_RESUME_ALL,
 					    mt7925_vif_connect_iter, NULL);
+	mt792x_mutex_release(dev);
 	mt76_connac_power_save_sched(&dev->mt76.phy, pm);

 	mt7925_regd_change(&dev->phy, "00");
diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/main.c b/drivers/net/wireless/mediatek/mt76/mt7925/main.c
index 15d1b1b8d9f8..80ca5181150b 100644
--- a/drivers/net/wireless/mediatek/mt76/mt7925/main.c
+++ b/drivers/net/wireless/mediatek/mt76/mt7925/main.c
@@ -755,9 +755,11 @@ void mt7925_set_runtime_pm(struct mt792x_dev *dev)
 	bool monitor = !!(hw->conf.flags & IEEE80211_CONF_MONITOR);

 	pm->enable = pm->enable_user && !monitor;
+	mt792x_mutex_acquire(dev);
 	ieee80211_iterate_active_interfaces(hw,
 					    IEEE80211_IFACE_ITER_RESUME_ALL,
 					    mt7925_pm_interface_iter, dev);
+	mt792x_mutex_release(dev);
 	pm->ds_enable = pm->ds_enable_user && !monitor;
 	mt7925_mcu_set_deep_sleep(dev, pm->ds_enable);
 }
@@ -1331,14 +1333,12 @@ mt7925_mlo_pm_iter(void *priv, u8 *mac, struct ieee80211_vif *vif)
 	if (mvif->mlo_pm_state != MT792x_MLO_CHANGED_PS)
 		return;

-	mt792x_mutex_acquire(dev);
 	for_each_set_bit(i, &valid, IEEE80211_MLD_MAX_NUM_LINKS) {
 		bss_conf = mt792x_vif_to_bss_conf(vif, i);
 		if (!bss_conf)
 			continue;
 		mt7925_mcu_uni_bss_ps(dev, bss_conf);
 	}
-	mt792x_mutex_release(dev);
 }

 void mt7925_mlo_pm_work(struct work_struct *work)
@@ -1347,9 +1347,11 @@ void mt7925_mlo_pm_work(struct work_struct *work)
 					      mlo_pm_work.work);
 	struct ieee80211_hw *hw = mt76_hw(dev);

+	mt792x_mutex_acquire(dev);
 	ieee80211_iterate_active_interfaces(hw,
 					    IEEE80211_IFACE_ITER_RESUME_ALL,
 					    mt7925_mlo_pm_iter, dev);
+	mt792x_mutex_release(dev);
 }

 void mt7925_scan_work(struct work_struct *work)
diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/pci.c b/drivers/net/wireless/mediatek/mt76/mt7925/pci.c
index c4161754c01d..3a9e32a1759d 100644
--- a/drivers/net/wireless/mediatek/mt76/mt7925/pci.c
+++ b/drivers/net/wireless/mediatek/mt76/mt7925/pci.c
@@ -455,7 +455,9 @@ static int mt7925_pci_suspend(struct device *device)
 	cancel_delayed_work_sync(&pm->ps_work);
 	cancel_work_sync(&pm->wake_work);

+	mt792x_mutex_acquire(dev);
 	mt7925_roc_abort_sync(dev);
+	mt792x_mutex_release(dev);

 	err = mt792x_mcu_drv_pmctrl(dev);
 	if (err < 0)
@@ -582,10 +584,12 @@ static int _mt7925_pci_resume(struct device *device, bool restore)
 	}

 	/* restore previous ds setting */
+	mt792x_mutex_acquire(dev);
 	if (!pm->ds_enable)
 		mt7925_mcu_set_deep_sleep(dev, false);

 	mt7925_mcu_regd_update(dev, mdev->alpha2, dev->country_ie_env);
+	mt792x_mutex_release(dev);
 failed:
 	pm->suspended = false;

-- 
2.52.0

^ permalink raw reply related [flat|nested] 113+ messages in thread
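The mt7925_mlo_pm_iter() hunk above drops the acquire/release pair from the iterator callback while mt7925_mlo_pm_work() gains one around the whole iteration. A plausible reading is that a non-recursive lock can be taken either by the caller or by the callback, but not by both. The sketch below models that with a toy lock whose second acquire simply fails; all toy_* names are hypothetical, and the real mt792x_mutex_acquire() also handles power-management wakeup, which is not modelled here.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy non-recursive lock: a second acquire by the holder cannot succeed. */
struct toy_mutex {
	bool held;
};

static bool toy_trylock(struct toy_mutex *m)
{
	if (m->held)
		return false;
	m->held = true;
	return true;
}

static void toy_unlock(struct toy_mutex *m)
{
	m->held = false;
}

struct toy_dev {
	struct toy_mutex mutex;
	int links_updated;
	int blocked_acquires; /* places where a real mutex would deadlock */
};

typedef void (*toy_iter_fn)(struct toy_dev *dev);

/* Callback that locks for itself (the shape the hunk removes). */
static void toy_iter_self_locked(struct toy_dev *dev)
{
	if (!toy_trylock(&dev->mutex)) {
		dev->blocked_acquires++;
		return;
	}
	dev->links_updated++;
	toy_unlock(&dev->mutex);
}

/* Callback that relies on the caller holding the lock. */
static void toy_iter_caller_locked(struct toy_dev *dev)
{
	dev->links_updated++;
}

/* Work function: take the lock once, then run the callback per interface. */
static struct toy_dev toy_run_work(toy_iter_fn fn, int n_ifaces)
{
	struct toy_dev dev = { .links_updated = 0 };

	toy_trylock(&dev.mutex);
	for (int i = 0; i < n_ifaces; i++)
		fn(&dev);
	toy_unlock(&dev.mutex);

	return dev;
}
```

With the lock held by toy_run_work(), the caller-locked callback updates every interface, while the self-locked callback is blocked on every call. That is the situation the patch avoids by moving the lock up one level.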
* [PATCH 07/11] wifi: mt76: mt7925: add MCU command error handling
  2026-01-20 6:28 ` [PATCH v5 00/11] wifi: mt76: mt7925/mt7921 stability fixes Zac
                  ` (5 preceding siblings ...)
  2026-01-20 6:28 ` [PATCH 06/11] wifi: mt76: mt7925: add mutex protection in critical paths Zac
@ 2026-01-20 6:28 ` Zac
  2026-01-20 6:28 ` [PATCH 08/11] wifi: mt76: mt7925: add lockdep assertions for mutex verification Zac
                  ` (3 subsequent siblings)
  10 siblings, 0 replies; 113+ messages in thread
From: Zac @ 2026-01-20 6:28 UTC (permalink / raw)
  To: sean.wang
  Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless,
	lorenzo, nbd, ryder.lee, sean.wang, stable, linux, zbowling,
	Zac Bowling

From: Zac Bowling <zac@zacbowling.com>

Add proper error handling for MCU command return values that were
previously being ignored. Without proper error handling, failures in
MCU communication can leave the driver in an inconsistent state.

Functions updated:

1. main.c: mt7925_ampdu_action() - BA session setup
   - Check mt7925_mcu_uni_tx_ba() return value
   - Check mt7925_mcu_uni_rx_ba() return value
   - Return error to mac80211 on failure

2. main.c: mt7925_mac_link_sta_add() - Station addition
   - Check mt7925_mcu_add_bss_info() return value
   - Propagate errors during station setup

3. main.c: mt7925_set_key() - Key installation
   - Check mt7925_mcu_add_bss_info() return value when setting BSS
     info before key installation
   - Prevent key setup on communication failure

These changes ensure that MCU communication failures are properly
detected and reported to mac80211, allowing proper error recovery
instead of leaving the driver in an undefined state.

Fixes: c948b5da6bbe ("wifi: mt76: mt7925: add Mediatek Wi-Fi7 driver for mt7925 device")
Signed-off-by: Zac Bowling <zac@zacbowling.com>
---
 .../net/wireless/mediatek/mt76/mt7925/main.c | 30 +++++++++++--------
 1 file changed, 18 insertions(+), 12 deletions(-)

diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/main.c b/drivers/net/wireless/mediatek/mt76/mt7925/main.c
index 80ca5181150b..5f8a28d5ff72 100644
--- a/drivers/net/wireless/mediatek/mt76/mt7925/main.c
+++ b/drivers/net/wireless/mediatek/mt76/mt7925/main.c
@@ -637,8 +637,10 @@ static int mt7925_set_link_key(struct ieee80211_hw *hw, enum set_key_cmd cmd,
 		struct mt792x_phy *phy = mt792x_hw_phy(hw);

 		mconf->mt76.cipher = mt7925_mcu_get_cipher(key->cipher);
-		mt7925_mcu_add_bss_info(phy, mconf->mt76.ctx, link_conf,
-					link_sta, true);
+		err = mt7925_mcu_add_bss_info(phy, mconf->mt76.ctx, link_conf,
+					      link_sta, true);
+		if (err)
+			goto out;
 	}

 	if (cmd == SET_KEY)
@@ -904,11 +906,14 @@ static int mt7925_mac_link_sta_add(struct mt76_dev *mdev,
 	/* should update bss info before STA add */
 	if (vif->type == NL80211_IFTYPE_STATION && !link_sta->sta->tdls) {
 		if (ieee80211_vif_is_mld(vif))
-			mt7925_mcu_add_bss_info(&dev->phy, mconf->mt76.ctx,
-						link_conf, link_sta, link_sta != mlink->pri_link);
+			ret = mt7925_mcu_add_bss_info(&dev->phy, mconf->mt76.ctx,
+						      link_conf, link_sta,
+						      link_sta != mlink->pri_link);
 		else
-			mt7925_mcu_add_bss_info(&dev->phy, mconf->mt76.ctx,
-						link_conf, link_sta, false);
+			ret = mt7925_mcu_add_bss_info(&dev->phy, mconf->mt76.ctx,
+						      link_conf, link_sta, false);
+		if (ret)
+			return ret;
 	}

 	if (ieee80211_vif_is_mld(vif) &&
@@ -1287,22 +1292,22 @@ mt7925_ampdu_action(struct ieee80211_hw *hw, struct ieee80211_vif *vif,
 	case IEEE80211_AMPDU_RX_START:
 		mt76_rx_aggr_start(&dev->mt76, &msta->deflink.wcid, tid, ssn,
 				   params->buf_size);
-		mt7925_mcu_uni_rx_ba(dev, params, true);
+		ret = mt7925_mcu_uni_rx_ba(dev, params, true);
 		break;
 	case IEEE80211_AMPDU_RX_STOP:
 		mt76_rx_aggr_stop(&dev->mt76, &msta->deflink.wcid, tid);
-		mt7925_mcu_uni_rx_ba(dev, params, false);
+		ret = mt7925_mcu_uni_rx_ba(dev, params, false);
 		break;
 	case IEEE80211_AMPDU_TX_OPERATIONAL:
 		mtxq->aggr = true;
 		mtxq->send_bar = false;
-		mt7925_mcu_uni_tx_ba(dev, params, true);
+		ret = mt7925_mcu_uni_tx_ba(dev, params, true);
 		break;
 	case IEEE80211_AMPDU_TX_STOP_FLUSH:
 	case IEEE80211_AMPDU_TX_STOP_FLUSH_CONT:
 		mtxq->aggr = false;
 		clear_bit(tid, &msta->deflink.wcid.ampdu_state);
-		mt7925_mcu_uni_tx_ba(dev, params, false);
+		ret = mt7925_mcu_uni_tx_ba(dev, params, false);
 		break;
 	case IEEE80211_AMPDU_TX_START:
 		set_bit(tid, &msta->deflink.wcid.ampdu_state);
@@ -1311,8 +1316,9 @@ mt7925_ampdu_action(struct ieee80211_hw *hw, struct ieee80211_vif *vif,
 	case IEEE80211_AMPDU_TX_STOP_CONT:
 		mtxq->aggr = false;
 		clear_bit(tid, &msta->deflink.wcid.ampdu_state);
-		mt7925_mcu_uni_tx_ba(dev, params, false);
-		ieee80211_stop_tx_ba_cb_irqsafe(vif, sta->addr, tid);
+		ret = mt7925_mcu_uni_tx_ba(dev, params, false);
+		if (!ret)
+			ieee80211_stop_tx_ba_cb_irqsafe(vif, sta->addr, tid);
 		break;
 	}
 	mt792x_mutex_release(dev);
-- 
2.52.0

^ permalink raw reply related [flat|nested] 113+ messages in thread
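The mt7925_set_link_key() hunk above stores the MCU result and jumps to a shared exit label on failure, so later steps are skipped while the unlock still runs. A minimal sketch of that shape, assuming a pretend device with an injectable failure, is shown below. All toy_* names and fields are hypothetical, and the choice of -EIO as the injected error is arbitrary.

```c
#include <assert.h>
#include <errno.h>

struct toy_dev {
	int lock_depth;     /* 0 means unlocked */
	int bss_info_fails; /* make the first command fail when non-zero */
	int key_installed;
};

/* Pretend MCU command that can be forced to fail. */
static int toy_mcu_add_bss_info(struct toy_dev *dev)
{
	return dev->bss_info_fails ? -EIO : 0;
}

/* Pretend MCU command that always succeeds. */
static int toy_mcu_add_key(struct toy_dev *dev)
{
	dev->key_installed = 1;
	return 0;
}

static int toy_set_key(struct toy_dev *dev)
{
	int err;

	dev->lock_depth++;

	err = toy_mcu_add_bss_info(dev);
	if (err)
		goto out; /* do not install a key on top of a failed update */

	err = toy_mcu_add_key(dev);
out:
	dev->lock_depth--; /* single exit: the lock is always dropped */
	return err;
}

static int toy_demo_failure(void)
{
	struct toy_dev dev = { .bss_info_fails = 1 };
	int err = toy_set_key(&dev);

	return err == -EIO && !dev.key_installed && dev.lock_depth == 0;
}

static int toy_demo_success(void)
{
	struct toy_dev dev = { .bss_info_fails = 0 };
	int err = toy_set_key(&dev);

	return err == 0 && dev.key_installed && dev.lock_depth == 0;
}
```

The single exit label is what keeps the error path and the success path from diverging on cleanup; an early return in the middle would need its own unlock.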
* [PATCH 08/11] wifi: mt76: mt7925: add lockdep assertions for mutex verification
  2026-01-20 6:28 ` [PATCH v5 00/11] wifi: mt76: mt7925/mt7921 stability fixes Zac
                  ` (6 preceding siblings ...)
  2026-01-20 6:28 ` [PATCH 07/11] wifi: mt76: mt7925: add MCU command error handling Zac
@ 2026-01-20 6:28 ` Zac
  2026-01-20 6:28 ` [PATCH 09/11] wifi: mt76: mt7925: fix MLO roaming and ROC setup issues Zac
                  ` (2 subsequent siblings)
  10 siblings, 0 replies; 113+ messages in thread
From: Zac @ 2026-01-20 6:28 UTC (permalink / raw)
  To: sean.wang
  Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless,
	lorenzo, nbd, ryder.lee, sean.wang, stable, linux, zbowling,
	Zac Bowling

From: Zac Bowling <zac@zacbowling.com>

Add lockdep_assert_held() calls to critical MCU functions to help catch
mutex violations during development and debugging. This follows the
pattern used in other mt76 drivers (mt7996, mt7915, mt7615).

Functions with new assertions:
- mt7925_mcu_add_bss_info(): Core BSS configuration MCU command
- mt7925_mcu_sta_update(): Station record update MCU command
- mt7925_mcu_uni_bss_ps(): Power save state MCU command

These functions modify firmware state and must be called with the
device mutex held to prevent race conditions. The lockdep assertions
will trigger warnings at runtime if code paths exist that call these
functions without proper mutex protection.

This aids in detecting the class of bugs fixed by patches in this
series.

Signed-off-by: Zac Bowling <zac@zacbowling.com>
---
 drivers/net/wireless/mediatek/mt76/mt7925/mcu.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/mcu.c b/drivers/net/wireless/mediatek/mt76/mt7925/mcu.c
index 94ec62a4538a..1c58b0be2be4 100644
--- a/drivers/net/wireless/mediatek/mt76/mt7925/mcu.c
+++ b/drivers/net/wireless/mediatek/mt76/mt7925/mcu.c
@@ -1532,6 +1532,8 @@ int mt7925_mcu_uni_bss_ps(struct mt792x_dev *dev,
 		},
 	};

+	lockdep_assert_held(&dev->mt76.mutex);
+
 	if (link_conf->vif->type != NL80211_IFTYPE_STATION)
 		return -EOPNOTSUPP;

@@ -2047,6 +2049,8 @@ int mt7925_mcu_sta_update(struct mt792x_dev *dev,
 	struct mt792x_sta *msta;
 	struct mt792x_link_sta *mlink = NULL;

+	lockdep_assert_held(&dev->mt76.mutex);
+
 	if (link_sta) {
 		msta = (struct mt792x_sta *)link_sta->sta->drv_priv;
 		mlink = mt792x_sta_to_link(msta, link_sta->link_id);
@@ -2853,6 +2857,8 @@ int mt7925_mcu_add_bss_info(struct mt792x_phy *phy,
 	struct mt792x_link_sta *mlink_bc;
 	struct sk_buff *skb;

+	lockdep_assert_held(&dev->mt76.mutex);
+
 	skb = __mt7925_mcu_alloc_bss_req(&dev->mt76, &mconf->mt76,
 					 MT7925_BSS_UPDATE_MAX_SIZE);
 	if (IS_ERR(skb))
-- 
2.52.0

^ permalink raw reply related [flat|nested] 113+ messages in thread
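The assertions in this patch do not take the lock; they only report when a caller arrives without it. A userspace analogue of that idea is sketched below, assuming a toy lock with a plain "held" flag and a warning counter in place of a kernel splat. All toy_* names are hypothetical. Real lockdep tracks which context owns the lock and is only active when lock debugging is enabled in the kernel configuration, neither of which this sketch models.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_mutex {
	bool held;
};

struct toy_dev {
	struct toy_mutex mutex;
	unsigned int lock_warnings; /* counts contract violations */
	unsigned int commands_sent;
};

/* Stand-in for an assert-held check: record a warning, keep running. */
static void toy_assert_held(struct toy_dev *dev)
{
	if (!dev->mutex.held)
		dev->lock_warnings++;
}

/* Pretend MCU command whose contract is "caller holds dev->mutex". */
static void toy_mcu_sta_update(struct toy_dev *dev)
{
	toy_assert_held(dev);
	dev->commands_sent++;
}

static unsigned int toy_demo_unlocked_call(void)
{
	struct toy_dev dev = { .lock_warnings = 0 };

	toy_mcu_sta_update(&dev); /* contract violated */
	return dev.lock_warnings;
}

static unsigned int toy_demo_locked_call(void)
{
	struct toy_dev dev = { .lock_warnings = 0 };

	dev.mutex.held = true;
	toy_mcu_sta_update(&dev); /* contract honoured */
	dev.mutex.held = false;
	return dev.lock_warnings;
}
```

The command still runs in both cases. As in the commit message, the check is a detection aid for development, not a substitute for taking the lock.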
* [PATCH 09/11] wifi: mt76: mt7925: fix MLO roaming and ROC setup issues
  2026-01-20 6:28 ` [PATCH v5 00/11] wifi: mt76: mt7925/mt7921 stability fixes Zac
                  ` (7 preceding siblings ...)
  2026-01-20 6:28 ` [PATCH 08/11] wifi: mt76: mt7925: add lockdep assertions for mutex verification Zac
@ 2026-01-20 6:28 ` Zac
  2026-01-20 6:28 ` [PATCH 10/11] wifi: mt76: mt7925: fix BA session teardown during beacon loss Zac
  2026-01-20 6:28 ` [PATCH 11/11] wifi: mt76: mt7925: fix ROC deadlocks and race conditions Zac
  10 siblings, 0 replies; 113+ messages in thread
From: Zac @ 2026-01-20 6:28 UTC (permalink / raw)
  To: sean.wang
  Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless,
	lorenzo, nbd, ryder.lee, sean.wang, stable, linux, zbowling,
	Zac Bowling

From: Zac Bowling <zac@zacbowling.com>

Fix two issues related to MLO roaming and remain-on-channel operations:

1. Key removal failure during MLO roaming:
   During MLO roaming, key removal can fail because the WCID (wireless
   client ID) is already cleaned up before the key removal operation
   completes. When roaming between APs in an MLO setup:
   - mac80211 triggers sta_state changes
   - mt7925_mac_link_sta_remove() is called for the old link
   - WCID is cleared via mt76_wcid_cleanup()
   - Later, key removal MCU command uses the now-invalid WCID

   Fix by checking if the WCID is still valid before sending key
   removal commands to firmware. If the WCID has already been cleaned
   up, skip the MCU command since the firmware has already removed the
   keys.

2. Kernel warning in MLO ROC setup:
   When starting a remain-on-channel operation in MLO mode, the driver
   passes incorrect parameters to mt7925_mcu_set_roc(), causing a
   kernel warning about invalid chanctx usage.

   Fix by checking for valid chanctx and link configuration before
   setting up ROC, and use the correct link_id from the vif when
   available.

Fixes: c948b5da6bbe ("wifi: mt76: mt7925: add Mediatek Wi-Fi7 driver for mt7925 device")
Signed-off-by: Zac Bowling <zac@zacbowling.com>
---
 .../net/wireless/mediatek/mt76/mt7925/main.c |  9 ++++++++-
 .../net/wireless/mediatek/mt76/mt7925/mcu.c  | 20 +++++++++++++------
 2 files changed, 22 insertions(+), 7 deletions(-)

diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/main.c b/drivers/net/wireless/mediatek/mt76/mt7925/main.c
index 5f8a28d5ff72..81373e479abd 100644
--- a/drivers/net/wireless/mediatek/mt76/mt7925/main.c
+++ b/drivers/net/wireless/mediatek/mt76/mt7925/main.c
@@ -605,8 +605,15 @@ static int mt7925_set_link_key(struct ieee80211_hw *hw, enum set_key_cmd cmd,
 	mconf = mt792x_vif_to_link(mvif, link_id);
 	mlink = mt792x_sta_to_link(msta, link_id);

-	if (!link_conf || !mconf || !mlink)
+	if (!link_conf || !mconf || !mlink) {
+		/* During MLO roaming, link state may be torn down before
+		 * mac80211 requests key removal. If removing a key and
+		 * the link is already gone, consider it successfully removed.
+		 */
+		if (cmd != SET_KEY)
+			return 0;
 		return -EINVAL;
+	}

 	wcid = &mlink->wcid;
 	wcid_keyidx = &wcid->hw_key_idx;
diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/mcu.c b/drivers/net/wireless/mediatek/mt76/mt7925/mcu.c
index 1c58b0be2be4..6f7fc1b9a440 100644
--- a/drivers/net/wireless/mediatek/mt76/mt7925/mcu.c
+++ b/drivers/net/wireless/mediatek/mt76/mt7925/mcu.c
@@ -1342,15 +1342,23 @@ int mt7925_mcu_set_mlo_roc(struct mt792x_bss_conf *mconf, u16 sel_links,
 	for (i = 0; i < ARRAY_SIZE(links); i++) {
 		links[i].id = i ? __ffs(~BIT(mconf->link_id) & sel_links) :
 				  mconf->link_id;
+
 		link_conf = mt792x_vif_to_bss_conf(vif, links[i].id);
-		if (WARN_ON_ONCE(!link_conf))
-			return -EPERM;
+		if (!link_conf)
+			return -ENOLINK;

 		links[i].chan = link_conf->chanreq.oper.chan;
-		if (WARN_ON_ONCE(!links[i].chan))
-			return -EPERM;
+		if (!links[i].chan)
+			/* Channel not configured yet - this can happen during
+			 * MLO AP setup when links are being added sequentially.
+			 * Return -ENOLINK to indicate link not ready.
+			 */
+			return -ENOLINK;

 		links[i].mconf = mt792x_vif_to_link(mvif, links[i].id);
+		if (!links[i].mconf)
+			return -ENOLINK;
+
 		links[i].tag = links[i].id == mconf->link_id ?
 			       UNI_ROC_ACQUIRE : UNI_ROC_SUB_LINK;

@@ -1364,8 +1372,8 @@ int mt7925_mcu_set_mlo_roc(struct mt792x_bss_conf *mconf, u16 sel_links,
 		type = MT7925_ROC_REQ_JOIN;

 	for (i = 0; i < ARRAY_SIZE(links) && i < hweight16(vif->active_links); i++) {
-		if (WARN_ON_ONCE(!links[i].mconf || !links[i].chan))
-			continue;
+		if (!links[i].mconf || !links[i].chan)
+			return -ENOLINK;

 		chan = links[i].chan;
 		center_ch = ieee80211_frequency_to_channel(chan->center_freq);
-- 
2.52.0

^ permalink raw reply related [flat|nested] 113+ messages in thread
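The mt7925_set_link_key() change above makes key removal idempotent with respect to link teardown: removing a key from a link that no longer exists is reported as success, while installing a key on a missing link remains an error. A minimal sketch of that decision follows. The toy_key_cmd enum, toy_link struct and toy_set_link_key() are hypothetical names used only for illustration; the real function checks three separate pointers rather than one.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

enum toy_key_cmd {
	TOY_SET_KEY,
	TOY_DISABLE_KEY,
};

struct toy_link {
	int mcu_key_cmds; /* number of key commands "sent to firmware" */
};

/* link == NULL models a link whose state has already been torn down. */
static int toy_set_link_key(enum toy_key_cmd cmd, struct toy_link *link)
{
	if (!link) {
		/* Nothing left to remove: treat removal as already done. */
		if (cmd != TOY_SET_KEY)
			return 0;
		return -EINVAL;
	}

	link->mcu_key_cmds++;
	return 0;
}

/* A live link still results in exactly one key command. */
static int toy_demo_live_link(void)
{
	struct toy_link link = { .mcu_key_cmds = 0 };

	return toy_set_link_key(TOY_SET_KEY, &link) == 0 &&
	       link.mcu_key_cmds == 1;
}
```

Returning 0 for the removal case lets the caller's teardown sequence continue instead of stopping on an error for state that is already gone.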
* [PATCH 10/11] wifi: mt76: mt7925: fix BA session teardown during beacon loss
  2026-01-20 6:28 ` [PATCH v5 00/11] wifi: mt76: mt7925/mt7921 stability fixes Zac
                  ` (8 preceding siblings ...)
  2026-01-20 6:28 ` [PATCH 09/11] wifi: mt76: mt7925: fix MLO roaming and ROC setup issues Zac
@ 2026-01-20 6:28 ` Zac
  2026-01-20 6:28 ` [PATCH 11/11] wifi: mt76: mt7925: fix ROC deadlocks and race conditions Zac
  10 siblings, 0 replies; 113+ messages in thread
From: Zac @ 2026-01-20 6:28 UTC (permalink / raw)
  To: sean.wang
  Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless,
	lorenzo, nbd, ryder.lee, sean.wang, stable, linux, zbowling,
	Zac Bowling

From: Zac Bowling <zac@zacbowling.com>

The ieee80211_stop_tx_ba_cb_irqsafe() callback was conditionally called
only when the MCU command succeeded. However, during beacon connection
loss, the MCU command may fail because the AP is no longer reachable.

If the callback is not called, mac80211's BA session state machine gets
stuck in an intermediate state. When mac80211 later tries to tear down
all BA sessions during disconnection, it hits a WARN in
__ieee80211_stop_tx_ba_session() due to the inconsistent state.

Fix by making the callback unconditional, matching the behavior of
mt7921 and mt7996 drivers. The MCU command failure is acceptable during
disconnection - what matters is that mac80211 is notified to complete
the session teardown.

Reported-by: Sean Wang <sean.wang@mediatek.com>
Signed-off-by: Zac Bowling <zac@zacbowling.com>
---
 drivers/net/wireless/mediatek/mt76/mt7925/main.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/main.c b/drivers/net/wireless/mediatek/mt76/mt7925/main.c
index 81373e479abd..cc7ef2c17032 100644
--- a/drivers/net/wireless/mediatek/mt76/mt7925/main.c
+++ b/drivers/net/wireless/mediatek/mt76/mt7925/main.c
@@ -1323,9 +1323,13 @@ mt7925_ampdu_action(struct ieee80211_hw *hw, struct ieee80211_vif *vif,
 	case IEEE80211_AMPDU_TX_STOP_CONT:
 		mtxq->aggr = false;
 		clear_bit(tid, &msta->deflink.wcid.ampdu_state);
-		ret = mt7925_mcu_uni_tx_ba(dev, params, false);
-		if (!ret)
-			ieee80211_stop_tx_ba_cb_irqsafe(vif, sta->addr, tid);
+		/* MCU command may fail during beacon loss, but callback must
+		 * always be called to complete the BA session teardown in
+		 * mac80211. Otherwise the state machine gets stuck and triggers
+		 * WARN in __ieee80211_stop_tx_ba_session().
+		 */
+		mt7925_mcu_uni_tx_ba(dev, params, false);
+		ieee80211_stop_tx_ba_cb_irqsafe(vif, sta->addr, tid);
 		break;
 	}
 	mt792x_mutex_release(dev);
-- 
2.52.0

^ permalink raw reply related [flat|nested] 113+ messages in thread
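The difference between the conditional and the unconditional completion callback can be shown with a two-field toy session: one flag says whether the peer is reachable (so the pretend MCU command succeeds), the other records whether the upper layer was told that teardown finished. All toy_* names below are hypothetical, and -ETIMEDOUT is only an assumed failure code for the unreachable case.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

struct toy_ba_session {
	bool peer_reachable; /* false models beacon loss */
	bool stop_cb_called; /* upper layer was told teardown finished */
};

/* Pretend MCU command: fails when the peer cannot be reached. */
static int toy_mcu_tx_ba_stop(const struct toy_ba_session *s)
{
	return s->peer_reachable ? 0 : -ETIMEDOUT;
}

/* Behaviour after patch 07: notify only on MCU success. */
static void toy_stop_conditional(struct toy_ba_session *s)
{
	if (!toy_mcu_tx_ba_stop(s))
		s->stop_cb_called = true;
}

/* Behaviour after this patch: always notify, ignore the MCU result. */
static void toy_stop_unconditional(struct toy_ba_session *s)
{
	(void)toy_mcu_tx_ba_stop(s);
	s->stop_cb_called = true;
}

/* Run one stop variant and report whether the upper layer was notified. */
static bool toy_demo(void (*stop)(struct toy_ba_session *), bool reachable)
{
	struct toy_ba_session s = { .peer_reachable = reachable };

	stop(&s);
	return s.stop_cb_called;
}
```

Both variants behave the same while the peer is reachable. They only diverge under beacon loss, which is the case the commit message describes as leaving the session stuck.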
* [PATCH 11/11] wifi: mt76: mt7925: fix ROC deadlocks and race conditions
  2026-01-20 6:28 ` [PATCH v5 00/11] wifi: mt76: mt7925/mt7921 stability fixes Zac
                  ` (9 preceding siblings ...)
  2026-01-20 6:28 ` [PATCH 10/11] wifi: mt76: mt7925: fix BA session teardown during beacon loss Zac
@ 2026-01-20 6:28 ` Zac
  2026-01-20 8:25 ` Sean Wang
                  ` (2 more replies)
  10 siblings, 3 replies; 113+ messages in thread
From: Zac @ 2026-01-20 6:28 UTC (permalink / raw)
  To: sean.wang
  Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless,
	lorenzo, nbd, ryder.lee, sean.wang, stable, linux, zbowling,
	Zac Bowling

From: Zac Bowling <zac@zacbowling.com>

Fix multiple interrelated issues in the remain-on-channel (ROC)
handling that cause deadlocks, race conditions, and resource leaks.

Problems fixed:

1. Deadlock in sta removal ROC abort path:
   When a station is removed while a ROC operation is in progress, the
   driver would call mt7925_roc_abort_sync() which waits for ROC
   completion. However, the ROC work itself needs to acquire
   mt792x_mutex which is already held during station removal, causing
   a deadlock.

   Fix: Use async ROC abort (mt76_connac_mcu_abort_roc) when called
   from paths that already hold the mutex, and add
   MT76_STATE_ROC_ABORT flag to coordinate between the abort and the
   ROC timer.

2. ROC timer race during suspend:
   The ROC timer could fire after the device started suspending but
   before the ROC was properly aborted, causing undefined behavior.

   Fix: Delete ROC timer synchronously before suspend and check device
   state before processing ROC timeout.

3. ROC rate limiting for MLO auth failures:
   Rapid ROC requests during MLO authentication can overwhelm the
   firmware, causing authentication timeouts. The MT7925 firmware has
   limited ROC handling capacity.

   Fix: Add rate limiting infrastructure with configurable minimum
   interval between ROC requests. Track last ROC completion time and
   defer new requests if they arrive too quickly.

4.
WCID leak in ROC cleanup: When ROC operations are aborted, the associated WCID resources were not being properly released, causing resource exhaustion over time. Fix: Ensure WCID cleanup happens in all ROC termination paths. 5. Async ROC abort race condition: The async ROC abort could race with normal ROC completion, causing double-free or use-after-free of ROC resources. Fix: Use MT76_STATE_ROC_ABORT flag and proper synchronization to prevent races between async abort and normal completion paths. These fixes work together to provide robust ROC handling that doesn't deadlock, properly releases resources, and handles edge cases during suspend and MLO operations. Fixes: c948b5da6bbe ("wifi: mt76: mt7925: add Mediatek Wi-Fi7 driver for mt7925 device") Signed-off-by: Zac Bowling <zac@zacbowling.com> --- drivers/net/wireless/mediatek/mt76/mt76.h | 1 + .../net/wireless/mediatek/mt76/mt7925/main.c | 175 ++++++++++++++++-- drivers/net/wireless/mediatek/mt76/mt792x.h | 7 + 3 files changed, 170 insertions(+), 13 deletions(-) diff --git a/drivers/net/wireless/mediatek/mt76/mt76.h b/drivers/net/wireless/mediatek/mt76/mt76.h index d05e83ea1cac..91f9dd95c89e 100644 --- a/drivers/net/wireless/mediatek/mt76/mt76.h +++ b/drivers/net/wireless/mediatek/mt76/mt76.h @@ -511,6 +511,7 @@ enum { MT76_STATE_POWER_OFF, MT76_STATE_SUSPEND, MT76_STATE_ROC, + MT76_STATE_ROC_ABORT, MT76_STATE_PM, MT76_STATE_WED_RESET, }; diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/main.c b/drivers/net/wireless/mediatek/mt76/mt7925/main.c index cc7ef2c17032..2404f7812897 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7925/main.c +++ b/drivers/net/wireless/mediatek/mt76/mt7925/main.c @@ -453,6 +453,24 @@ static void mt7925_roc_iter(void *priv, u8 *mac, mt7925_mcu_abort_roc(phy, &mvif->bss_conf, phy->roc_token_id); } +/* Async ROC abort - safe to call while holding mutex. + * Sets abort flag and lets roc_work handle cleanup without blocking. 
+ * This prevents deadlock when called from sta_remove path which holds mutex. + */ +static void mt7925_roc_abort_async(struct mt792x_dev *dev) +{ + struct mt792x_phy *phy = &dev->phy; + + /* Set abort flag - roc_work checks this before acquiring mutex */ + set_bit(MT76_STATE_ROC_ABORT, &phy->mt76->state); + + /* Stop timer and schedule work to handle cleanup. + * Must schedule work since timer may not have fired yet. + */ + timer_delete(&phy->roc_timer); + ieee80211_queue_work(phy->mt76->hw, &phy->roc_work); +} + void mt7925_roc_abort_sync(struct mt792x_dev *dev) { struct mt792x_phy *phy = &dev->phy; @@ -473,6 +491,17 @@ void mt7925_roc_work(struct work_struct *work) phy = (struct mt792x_phy *)container_of(work, struct mt792x_phy, roc_work); + /* Check abort flag BEFORE acquiring mutex to prevent deadlock. + * If abort is requested while we're in the sta_remove path (which + * holds the mutex), we must not try to acquire it or we'll deadlock. + * Clear the flags and only notify mac80211 if ROC was actually active. + */ + if (test_and_clear_bit(MT76_STATE_ROC_ABORT, &phy->mt76->state)) { + if (test_and_clear_bit(MT76_STATE_ROC, &phy->mt76->state)) + ieee80211_remain_on_channel_expired(phy->mt76->hw); + return; + } + if (!test_and_clear_bit(MT76_STATE_ROC, &phy->mt76->state)) return; @@ -500,14 +529,93 @@ static int mt7925_abort_roc(struct mt792x_phy *phy, return err; } +/* ROC rate limiting constants - exponential backoff to prevent MCU overload + * when upper layers trigger rapid reconnection cycles (e.g., MLO auth failures). + * Max backoff ~1.6s, resets after 10s of no timeouts. + */ +#define MT7925_ROC_BACKOFF_BASE_MS 100 +#define MT7925_ROC_BACKOFF_MAX_MS 1600 +#define MT7925_ROC_TIMEOUT_RESET_MS 10000 +#define MT7925_ROC_TIMEOUT_WARN_THRESH 5 + +/* Check if ROC should be throttled due to recent timeouts. + * Returns delay in jiffies if throttling, 0 if OK to proceed. 
+ */ +static unsigned long mt7925_roc_throttle_check(struct mt792x_phy *phy) +{ + unsigned long now = jiffies; + + /* Reset timeout counter if it's been a while since last timeout */ + if (phy->roc_timeout_count && + time_after(now, phy->roc_last_timeout + + msecs_to_jiffies(MT7925_ROC_TIMEOUT_RESET_MS))) { + phy->roc_timeout_count = 0; + phy->roc_backoff_until = 0; + } + + /* Check if we're still in backoff period */ + if (phy->roc_backoff_until && time_before(now, phy->roc_backoff_until)) + return phy->roc_backoff_until - now; + + return 0; +} + +/* Record ROC timeout and calculate backoff period */ +static void mt7925_roc_record_timeout(struct mt792x_phy *phy) +{ + unsigned int backoff_ms; + + phy->roc_last_timeout = jiffies; + phy->roc_timeout_count++; + + /* Exponential backoff: 100ms, 200ms, 400ms, 800ms, 1600ms (capped) */ + backoff_ms = MT7925_ROC_BACKOFF_BASE_MS << + min_t(u8, phy->roc_timeout_count - 1, 4); + if (backoff_ms > MT7925_ROC_BACKOFF_MAX_MS) + backoff_ms = MT7925_ROC_BACKOFF_MAX_MS; + + phy->roc_backoff_until = jiffies + msecs_to_jiffies(backoff_ms); + + /* Warn if we're seeing repeated timeouts - likely upper layer issue */ + if (phy->roc_timeout_count == MT7925_ROC_TIMEOUT_WARN_THRESH) + dev_warn(phy->dev->mt76.dev, + "mt7925: %u consecutive ROC timeouts, possible mac80211/wpa_supplicant issue (MLO key race?)\n", + phy->roc_timeout_count); +} + +/* Clear timeout tracking on successful ROC */ +static void mt7925_roc_clear_timeout(struct mt792x_phy *phy) +{ + phy->roc_timeout_count = 0; + phy->roc_backoff_until = 0; +} + static int mt7925_set_roc(struct mt792x_phy *phy, struct mt792x_bss_conf *mconf, struct ieee80211_channel *chan, int duration, enum mt7925_roc_req type) { + unsigned long throttle; int err; + /* Check rate limiting - if in backoff period, wait or return busy */ + throttle = mt7925_roc_throttle_check(phy); + if (throttle) { + /* For short backoffs, wait; for longer ones, return busy */ + if (throttle < msecs_to_jiffies(200)) { + 
msleep(jiffies_to_msecs(throttle)); + } else { + dev_dbg(phy->dev->mt76.dev, + "mt7925: ROC throttled, %lu ms remaining\n", + jiffies_to_msecs(throttle)); + return -EBUSY; + } + } + + /* Clear stale abort flag from previous ROC */ + clear_bit(MT76_STATE_ROC_ABORT, &phy->mt76->state); + if (test_and_set_bit(MT76_STATE_ROC, &phy->mt76->state)) return -EBUSY; @@ -523,7 +631,11 @@ static int mt7925_set_roc(struct mt792x_phy *phy, if (!wait_event_timeout(phy->roc_wait, phy->roc_grant, 4 * HZ)) { mt7925_mcu_abort_roc(phy, mconf, phy->roc_token_id); clear_bit(MT76_STATE_ROC, &phy->mt76->state); + mt7925_roc_record_timeout(phy); err = -ETIMEDOUT; + } else { + /* Successful ROC - reset timeout tracking */ + mt7925_roc_clear_timeout(phy); } out: @@ -534,8 +646,27 @@ static int mt7925_set_mlo_roc(struct mt792x_phy *phy, struct mt792x_bss_conf *mconf, u16 sel_links) { + unsigned long throttle; int err; + /* Check rate limiting - MLO ROC is especially prone to rapid-fire + * during reconnection cycles after MLO authentication failures. 
+ */ + throttle = mt7925_roc_throttle_check(phy); + if (throttle) { + if (throttle < msecs_to_jiffies(200)) { + msleep(jiffies_to_msecs(throttle)); + } else { + dev_dbg(phy->dev->mt76.dev, + "mt7925: MLO ROC throttled, %lu ms remaining\n", + jiffies_to_msecs(throttle)); + return -EBUSY; + } + } + + /* Clear stale abort flag from previous ROC */ + clear_bit(MT76_STATE_ROC_ABORT, &phy->mt76->state); + if (WARN_ON_ONCE(test_and_set_bit(MT76_STATE_ROC, &phy->mt76->state))) return -EBUSY; @@ -550,7 +681,10 @@ static int mt7925_set_mlo_roc(struct mt792x_phy *phy, if (!wait_event_timeout(phy->roc_wait, phy->roc_grant, 4 * HZ)) { mt7925_mcu_abort_roc(phy, mconf, phy->roc_token_id); clear_bit(MT76_STATE_ROC, &phy->mt76->state); + mt7925_roc_record_timeout(phy); err = -ETIMEDOUT; + } else { + mt7925_roc_clear_timeout(phy); } out: @@ -567,6 +701,7 @@ static int mt7925_remain_on_channel(struct ieee80211_hw *hw, struct mt792x_phy *phy = mt792x_hw_phy(hw); int err; + cancel_work_sync(&phy->roc_work); mt792x_mutex_acquire(phy->dev); err = mt7925_set_roc(phy, &mvif->bss_conf, chan, duration, MT7925_ROC_REQ_ROC); @@ -874,14 +1009,14 @@ static int mt7925_mac_link_sta_add(struct mt76_dev *mdev, if (!mlink) return -EINVAL; - idx = mt76_wcid_alloc(dev->mt76.wcid_mask, MT792x_WTBL_STA - 1); - if (idx < 0) - return -ENOSPC; - mconf = mt792x_vif_to_link(mvif, link_id); if (!mconf) return -EINVAL; + idx = mt76_wcid_alloc(dev->mt76.wcid_mask, MT792x_WTBL_STA - 1); + if (idx < 0) + return -ENOSPC; + mt76_wcid_init(&mlink->wcid, 0); mlink->wcid.sta = 1; mlink->wcid.idx = idx; @@ -901,14 +1036,16 @@ static int mt7925_mac_link_sta_add(struct mt76_dev *mdev, ret = mt76_connac_pm_wake(&dev->mphy, &dev->pm); if (ret) - return ret; + goto err_wcid; mt7925_mac_wtbl_update(dev, idx, MT_WTBL_UPDATE_ADM_COUNT_CLEAR); link_conf = mt792x_vif_to_bss_conf(vif, link_id); - if (!link_conf) - return -EINVAL; + if (!link_conf) { + ret = -EINVAL; + goto err_wcid; + } /* should update bss info before STA add */ 
if (vif->type == NL80211_IFTYPE_STATION && !link_sta->sta->tdls) { @@ -920,7 +1057,7 @@ static int mt7925_mac_link_sta_add(struct mt76_dev *mdev, ret = mt7925_mcu_add_bss_info(&dev->phy, mconf->mt76.ctx, link_conf, link_sta, false); if (ret) - return ret; + goto err_wcid; } if (ieee80211_vif_is_mld(vif) && @@ -928,28 +1065,34 @@ static int mt7925_mac_link_sta_add(struct mt76_dev *mdev, ret = mt7925_mcu_sta_update(dev, link_sta, vif, true, MT76_STA_INFO_STATE_NONE); if (ret) - return ret; + goto err_wcid; } else if (ieee80211_vif_is_mld(vif) && link_sta != mlink->pri_link) { ret = mt7925_mcu_sta_update(dev, mlink->pri_link, vif, true, MT76_STA_INFO_STATE_ASSOC); if (ret) - return ret; + goto err_wcid; ret = mt7925_mcu_sta_update(dev, link_sta, vif, true, MT76_STA_INFO_STATE_ASSOC); if (ret) - return ret; + goto err_wcid; } else { ret = mt7925_mcu_sta_update(dev, link_sta, vif, true, MT76_STA_INFO_STATE_NONE); if (ret) - return ret; + goto err_wcid; } mt76_connac_power_save_sched(&dev->mphy, &dev->pm); return 0; + +err_wcid: + rcu_assign_pointer(dev->mt76.wcid[idx], NULL); + mt76_wcid_mask_clear(dev->mt76.wcid_mask, idx); + mt76_connac_power_save_sched(&dev->mphy, &dev->pm); + return ret; } static int @@ -1135,7 +1278,8 @@ static void mt7925_mac_link_sta_remove(struct mt76_dev *mdev, if (!mlink) return; - mt7925_roc_abort_sync(dev); + /* Async abort - caller already holds mutex */ + mt7925_roc_abort_async(dev); mt76_connac_free_pending_tx_skbs(&dev->pm, &mlink->wcid); mt76_connac_pm_wake(&dev->mphy, &dev->pm); @@ -1530,6 +1674,8 @@ static int mt7925_suspend(struct ieee80211_hw *hw, cancel_delayed_work_sync(&dev->pm.ps_work); mt76_connac_free_pending_tx_skbs(&dev->pm, NULL); + /* Cancel ROC before quiescing starts */ + mt7925_roc_abort_sync(dev); mt792x_mutex_acquire(dev); clear_bit(MT76_STATE_RUNNING, &phy->mt76->state); @@ -1876,6 +2022,8 @@ static void mt7925_mgd_prepare_tx(struct ieee80211_hw *hw, u16 duration = info->duration ? 
info->duration : jiffies_to_msecs(HZ); + cancel_work_sync(&mvif->phy->roc_work); + mt792x_mutex_acquire(dev); mt7925_set_roc(mvif->phy, &mvif->bss_conf, mvif->bss_conf.mt76.ctx->def.chan, duration, @@ -2033,6 +2181,7 @@ mt7925_change_vif_links(struct ieee80211_hw *hw, struct ieee80211_vif *vif, if (old_links == new_links) return 0; + cancel_work_sync(&phy->roc_work); mt792x_mutex_acquire(dev); for_each_set_bit(link_id, &rem, IEEE80211_MLD_MAX_NUM_LINKS) { diff --git a/drivers/net/wireless/mediatek/mt76/mt792x.h b/drivers/net/wireless/mediatek/mt76/mt792x.h index 8388638ed550..d9c1ea709390 100644 --- a/drivers/net/wireless/mediatek/mt76/mt792x.h +++ b/drivers/net/wireless/mediatek/mt76/mt792x.h @@ -186,6 +186,13 @@ struct mt792x_phy { wait_queue_head_t roc_wait; u8 roc_token_id; bool roc_grant; + + /* ROC rate limiting to prevent MCU overload during rapid reconnection + * cycles (e.g., MLO authentication failures causing repeated ROC). + */ + u8 roc_timeout_count; /* consecutive ROC timeouts */ + unsigned long roc_last_timeout; /* jiffies of last timeout */ + unsigned long roc_backoff_until;/* don't issue ROC until this time */ }; struct mt792x_irq_map { -- 2.52.0 ^ permalink raw reply related [flat|nested] 113+ messages in thread
* Re: [PATCH 11/11] wifi: mt76: mt7925: fix ROC deadlocks and race conditions 2026-01-20 6:28 ` [PATCH 11/11] wifi: mt76: mt7925: fix ROC deadlocks and race conditions Zac @ 2026-01-20 8:25 ` Sean Wang 2026-01-20 17:59 ` Zac Bowling 2026-01-20 20:10 ` [PATCH v6 00/13] wifi: mt76: stability fixes for deadlocks, NULL derefs, " Zac 2026-01-20 11:42 ` [PATCH 11/11] wifi: mt76: mt7925: fix ROC deadlocks and race conditions kernel test robot 2026-01-20 13:26 ` kernel test robot 2 siblings, 2 replies; 113+ messages in thread From: Sean Wang @ 2026-01-20 8:25 UTC (permalink / raw) To: Zac Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless, lorenzo, nbd, ryder.lee, sean.wang, stable, linux, zbowling On Tue, Jan 20, 2026 at 12:29 AM Zac <zac@zacbowling.com> wrote: > > From: Zac Bowling <zac@zacbowling.com> > > Fix multiple interrelated issues in the remain-on-channel (ROC) handling > that cause deadlocks, race conditions, and resource leaks. > > Problems fixed: > > 1. Deadlock in sta removal ROC abort path: > When a station is removed while a ROC operation is in progress, the > driver would call mt7925_roc_abort_sync() which waits for ROC completion. > However, the ROC work itself needs to acquire mt792x_mutex which is > already held during station removal, causing a deadlock. > > Fix: Use async ROC abort (mt76_connac_mcu_abort_roc) when called from > paths that already hold the mutex, and add MT76_STATE_ROC_ABORT flag > to coordinate between the abort and the ROC timer. > Hi Zac, Thanks for your continued efforts on the driver. We’ve sent a patch to address the mt7925 deadlock at the link below: https://lists.infradead.org/pipermail/linux-mediatek/2025-December/102164.html We plan to send the same fix to mt7921 as well. I had a couple of questions and suggestions: 1. Would it be possible to rebase your patchset on top of this fix (and any other pending patches that are not yet merged)? 
We noticed some conflicts when applying the series, and rebasing it this way would make it easier for nbd to integrate the full patchset. 2. Could you please elaborate on the test scenarios that would trigger ROC rate limiting for MLO authentication failures? If I recall correctly, ROC operations are typically handled sequentially unless multiple interfaces are created on the same physical device. In that case, how many virtual interfaces and which operating modes (GC/STA or multiple STAs) are required to reproduce the issue? I will try to prepare an out-of-tree branch with the current pending patches to help your patchset integrate more smoothly. Thanks for collecting community issues and fixes and incorporating them into the driver. Sean > 2. ROC timer race during suspend: > The ROC timer could fire after the device started suspending but before > the ROC was properly aborted, causing undefined behavior. > > Fix: Delete ROC timer synchronously before suspend and check device > state before processing ROC timeout. > > 3. ROC rate limiting for MLO auth failures: > Rapid ROC requests during MLO authentication can overwhelm the firmware, > causing authentication timeouts. The MT7925 firmware has limited ROC > handling capacity. > > Fix: Add rate limiting infrastructure with configurable minimum interval > between ROC requests. Track last ROC completion time and defer new > requests if they arrive too quickly. > > 4. WCID leak in ROC cleanup: > When ROC operations are aborted, the associated WCID resources were > not being properly released, causing resource exhaustion over time. > > Fix: Ensure WCID cleanup happens in all ROC termination paths. > > 5. Async ROC abort race condition: > The async ROC abort could race with normal ROC completion, causing > double-free or use-after-free of ROC resources. > > Fix: Use MT76_STATE_ROC_ABORT flag and proper synchronization to > prevent races between async abort and normal completion paths. 
> > These fixes work together to provide robust ROC handling that doesn't > deadlock, properly releases resources, and handles edge cases during > suspend and MLO operations. > > Fixes: c948b5da6bbe ("wifi: mt76: mt7925: add Mediatek Wi-Fi7 driver for mt7925 device") > Signed-off-by: Zac Bowling <zac@zacbowling.com> > --- > drivers/net/wireless/mediatek/mt76/mt76.h | 1 + > .../net/wireless/mediatek/mt76/mt7925/main.c | 175 ++++++++++++++++-- > drivers/net/wireless/mediatek/mt76/mt792x.h | 7 + > 3 files changed, 170 insertions(+), 13 deletions(-) > > diff --git a/drivers/net/wireless/mediatek/mt76/mt76.h b/drivers/net/wireless/mediatek/mt76/mt76.h > index d05e83ea1cac..91f9dd95c89e 100644 > --- a/drivers/net/wireless/mediatek/mt76/mt76.h > +++ b/drivers/net/wireless/mediatek/mt76/mt76.h > @@ -511,6 +511,7 @@ enum { > MT76_STATE_POWER_OFF, > MT76_STATE_SUSPEND, > MT76_STATE_ROC, > + MT76_STATE_ROC_ABORT, > MT76_STATE_PM, > MT76_STATE_WED_RESET, > }; > diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/main.c b/drivers/net/wireless/mediatek/mt76/mt7925/main.c > index cc7ef2c17032..2404f7812897 100644 > --- a/drivers/net/wireless/mediatek/mt76/mt7925/main.c > +++ b/drivers/net/wireless/mediatek/mt76/mt7925/main.c > @@ -453,6 +453,24 @@ static void mt7925_roc_iter(void *priv, u8 *mac, > mt7925_mcu_abort_roc(phy, &mvif->bss_conf, phy->roc_token_id); > } > > +/* Async ROC abort - safe to call while holding mutex. > + * Sets abort flag and lets roc_work handle cleanup without blocking. > + * This prevents deadlock when called from sta_remove path which holds mutex. > + */ > +static void mt7925_roc_abort_async(struct mt792x_dev *dev) > +{ > + struct mt792x_phy *phy = &dev->phy; > + > + /* Set abort flag - roc_work checks this before acquiring mutex */ > + set_bit(MT76_STATE_ROC_ABORT, &phy->mt76->state); > + > + /* Stop timer and schedule work to handle cleanup. > + * Must schedule work since timer may not have fired yet. 
> + */ > + timer_delete(&phy->roc_timer); > + ieee80211_queue_work(phy->mt76->hw, &phy->roc_work); > +} > + > void mt7925_roc_abort_sync(struct mt792x_dev *dev) > { > struct mt792x_phy *phy = &dev->phy; > @@ -473,6 +491,17 @@ void mt7925_roc_work(struct work_struct *work) > phy = (struct mt792x_phy *)container_of(work, struct mt792x_phy, > roc_work); > > + /* Check abort flag BEFORE acquiring mutex to prevent deadlock. > + * If abort is requested while we're in the sta_remove path (which > + * holds the mutex), we must not try to acquire it or we'll deadlock. > + * Clear the flags and only notify mac80211 if ROC was actually active. > + */ > + if (test_and_clear_bit(MT76_STATE_ROC_ABORT, &phy->mt76->state)) { > + if (test_and_clear_bit(MT76_STATE_ROC, &phy->mt76->state)) > + ieee80211_remain_on_channel_expired(phy->mt76->hw); > + return; > + } > + > if (!test_and_clear_bit(MT76_STATE_ROC, &phy->mt76->state)) > return; > > @@ -500,14 +529,93 @@ static int mt7925_abort_roc(struct mt792x_phy *phy, > return err; > } > > +/* ROC rate limiting constants - exponential backoff to prevent MCU overload > + * when upper layers trigger rapid reconnection cycles (e.g., MLO auth failures). > + * Max backoff ~1.6s, resets after 10s of no timeouts. > + */ > +#define MT7925_ROC_BACKOFF_BASE_MS 100 > +#define MT7925_ROC_BACKOFF_MAX_MS 1600 > +#define MT7925_ROC_TIMEOUT_RESET_MS 10000 > +#define MT7925_ROC_TIMEOUT_WARN_THRESH 5 > + > +/* Check if ROC should be throttled due to recent timeouts. > + * Returns delay in jiffies if throttling, 0 if OK to proceed. 
> + */ > +static unsigned long mt7925_roc_throttle_check(struct mt792x_phy *phy) > +{ > + unsigned long now = jiffies; > + > + /* Reset timeout counter if it's been a while since last timeout */ > + if (phy->roc_timeout_count && > + time_after(now, phy->roc_last_timeout + > + msecs_to_jiffies(MT7925_ROC_TIMEOUT_RESET_MS))) { > + phy->roc_timeout_count = 0; > + phy->roc_backoff_until = 0; > + } > + > + /* Check if we're still in backoff period */ > + if (phy->roc_backoff_until && time_before(now, phy->roc_backoff_until)) > + return phy->roc_backoff_until - now; > + > + return 0; > +} > + > +/* Record ROC timeout and calculate backoff period */ > +static void mt7925_roc_record_timeout(struct mt792x_phy *phy) > +{ > + unsigned int backoff_ms; > + > + phy->roc_last_timeout = jiffies; > + phy->roc_timeout_count++; > + > + /* Exponential backoff: 100ms, 200ms, 400ms, 800ms, 1600ms (capped) */ > + backoff_ms = MT7925_ROC_BACKOFF_BASE_MS << > + min_t(u8, phy->roc_timeout_count - 1, 4); > + if (backoff_ms > MT7925_ROC_BACKOFF_MAX_MS) > + backoff_ms = MT7925_ROC_BACKOFF_MAX_MS; > + > + phy->roc_backoff_until = jiffies + msecs_to_jiffies(backoff_ms); > + > + /* Warn if we're seeing repeated timeouts - likely upper layer issue */ > + if (phy->roc_timeout_count == MT7925_ROC_TIMEOUT_WARN_THRESH) > + dev_warn(phy->dev->mt76.dev, > + "mt7925: %u consecutive ROC timeouts, possible mac80211/wpa_supplicant issue (MLO key race?)\n", > + phy->roc_timeout_count); > +} > + > +/* Clear timeout tracking on successful ROC */ > +static void mt7925_roc_clear_timeout(struct mt792x_phy *phy) > +{ > + phy->roc_timeout_count = 0; > + phy->roc_backoff_until = 0; > +} > + > static int mt7925_set_roc(struct mt792x_phy *phy, > struct mt792x_bss_conf *mconf, > struct ieee80211_channel *chan, > int duration, > enum mt7925_roc_req type) > { > + unsigned long throttle; > int err; > > + /* Check rate limiting - if in backoff period, wait or return busy */ > + throttle = mt7925_roc_throttle_check(phy); > 
+ if (throttle) { > + /* For short backoffs, wait; for longer ones, return busy */ > + if (throttle < msecs_to_jiffies(200)) { > + msleep(jiffies_to_msecs(throttle)); > + } else { > + dev_dbg(phy->dev->mt76.dev, > + "mt7925: ROC throttled, %lu ms remaining\n", > + jiffies_to_msecs(throttle)); > + return -EBUSY; > + } > + } > + > + /* Clear stale abort flag from previous ROC */ > + clear_bit(MT76_STATE_ROC_ABORT, &phy->mt76->state); > + > if (test_and_set_bit(MT76_STATE_ROC, &phy->mt76->state)) > return -EBUSY; > > @@ -523,7 +631,11 @@ static int mt7925_set_roc(struct mt792x_phy *phy, > if (!wait_event_timeout(phy->roc_wait, phy->roc_grant, 4 * HZ)) { > mt7925_mcu_abort_roc(phy, mconf, phy->roc_token_id); > clear_bit(MT76_STATE_ROC, &phy->mt76->state); > + mt7925_roc_record_timeout(phy); > err = -ETIMEDOUT; > + } else { > + /* Successful ROC - reset timeout tracking */ > + mt7925_roc_clear_timeout(phy); > } > > out: > @@ -534,8 +646,27 @@ static int mt7925_set_mlo_roc(struct mt792x_phy *phy, > struct mt792x_bss_conf *mconf, > u16 sel_links) > { > + unsigned long throttle; > int err; > > + /* Check rate limiting - MLO ROC is especially prone to rapid-fire > + * during reconnection cycles after MLO authentication failures. 
> + */ > + throttle = mt7925_roc_throttle_check(phy); > + if (throttle) { > + if (throttle < msecs_to_jiffies(200)) { > + msleep(jiffies_to_msecs(throttle)); > + } else { > + dev_dbg(phy->dev->mt76.dev, > + "mt7925: MLO ROC throttled, %lu ms remaining\n", > + jiffies_to_msecs(throttle)); > + return -EBUSY; > + } > + } > + > + /* Clear stale abort flag from previous ROC */ > + clear_bit(MT76_STATE_ROC_ABORT, &phy->mt76->state); > + > if (WARN_ON_ONCE(test_and_set_bit(MT76_STATE_ROC, &phy->mt76->state))) > return -EBUSY; > > @@ -550,7 +681,10 @@ static int mt7925_set_mlo_roc(struct mt792x_phy *phy, > if (!wait_event_timeout(phy->roc_wait, phy->roc_grant, 4 * HZ)) { > mt7925_mcu_abort_roc(phy, mconf, phy->roc_token_id); > clear_bit(MT76_STATE_ROC, &phy->mt76->state); > + mt7925_roc_record_timeout(phy); > err = -ETIMEDOUT; > + } else { > + mt7925_roc_clear_timeout(phy); > } > > out: > @@ -567,6 +701,7 @@ static int mt7925_remain_on_channel(struct ieee80211_hw *hw, > struct mt792x_phy *phy = mt792x_hw_phy(hw); > int err; > > + cancel_work_sync(&phy->roc_work); > mt792x_mutex_acquire(phy->dev); > err = mt7925_set_roc(phy, &mvif->bss_conf, > chan, duration, MT7925_ROC_REQ_ROC); > @@ -874,14 +1009,14 @@ static int mt7925_mac_link_sta_add(struct mt76_dev *mdev, > if (!mlink) > return -EINVAL; > > - idx = mt76_wcid_alloc(dev->mt76.wcid_mask, MT792x_WTBL_STA - 1); > - if (idx < 0) > - return -ENOSPC; > - > mconf = mt792x_vif_to_link(mvif, link_id); > if (!mconf) > return -EINVAL; > > + idx = mt76_wcid_alloc(dev->mt76.wcid_mask, MT792x_WTBL_STA - 1); > + if (idx < 0) > + return -ENOSPC; > + > mt76_wcid_init(&mlink->wcid, 0); > mlink->wcid.sta = 1; > mlink->wcid.idx = idx; > @@ -901,14 +1036,16 @@ static int mt7925_mac_link_sta_add(struct mt76_dev *mdev, > > ret = mt76_connac_pm_wake(&dev->mphy, &dev->pm); > if (ret) > - return ret; > + goto err_wcid; > > mt7925_mac_wtbl_update(dev, idx, > MT_WTBL_UPDATE_ADM_COUNT_CLEAR); > > link_conf = mt792x_vif_to_bss_conf(vif, link_id); > 
- if (!link_conf) > - return -EINVAL; > + if (!link_conf) { > + ret = -EINVAL; > + goto err_wcid; > + } > > /* should update bss info before STA add */ > if (vif->type == NL80211_IFTYPE_STATION && !link_sta->sta->tdls) { > @@ -920,7 +1057,7 @@ static int mt7925_mac_link_sta_add(struct mt76_dev *mdev, > ret = mt7925_mcu_add_bss_info(&dev->phy, mconf->mt76.ctx, > link_conf, link_sta, false); > if (ret) > - return ret; > + goto err_wcid; > } > > if (ieee80211_vif_is_mld(vif) && > @@ -928,28 +1065,34 @@ static int mt7925_mac_link_sta_add(struct mt76_dev *mdev, > ret = mt7925_mcu_sta_update(dev, link_sta, vif, true, > MT76_STA_INFO_STATE_NONE); > if (ret) > - return ret; > + goto err_wcid; > } else if (ieee80211_vif_is_mld(vif) && > link_sta != mlink->pri_link) { > ret = mt7925_mcu_sta_update(dev, mlink->pri_link, vif, > true, MT76_STA_INFO_STATE_ASSOC); > if (ret) > - return ret; > + goto err_wcid; > > ret = mt7925_mcu_sta_update(dev, link_sta, vif, true, > MT76_STA_INFO_STATE_ASSOC); > if (ret) > - return ret; > + goto err_wcid; > } else { > ret = mt7925_mcu_sta_update(dev, link_sta, vif, true, > MT76_STA_INFO_STATE_NONE); > if (ret) > - return ret; > + goto err_wcid; > } > > mt76_connac_power_save_sched(&dev->mphy, &dev->pm); > > return 0; > + > +err_wcid: > + rcu_assign_pointer(dev->mt76.wcid[idx], NULL); > + mt76_wcid_mask_clear(dev->mt76.wcid_mask, idx); > + mt76_connac_power_save_sched(&dev->mphy, &dev->pm); > + return ret; > } > > static int > @@ -1135,7 +1278,8 @@ static void mt7925_mac_link_sta_remove(struct mt76_dev *mdev, > if (!mlink) > return; > > - mt7925_roc_abort_sync(dev); > + /* Async abort - caller already holds mutex */ > + mt7925_roc_abort_async(dev); > > mt76_connac_free_pending_tx_skbs(&dev->pm, &mlink->wcid); > mt76_connac_pm_wake(&dev->mphy, &dev->pm); > @@ -1530,6 +1674,8 @@ static int mt7925_suspend(struct ieee80211_hw *hw, > cancel_delayed_work_sync(&dev->pm.ps_work); > mt76_connac_free_pending_tx_skbs(&dev->pm, NULL); > > + /* Cancel ROC 
before quiescing starts */ > + mt7925_roc_abort_sync(dev); > mt792x_mutex_acquire(dev); > > clear_bit(MT76_STATE_RUNNING, &phy->mt76->state); > @@ -1876,6 +2022,8 @@ static void mt7925_mgd_prepare_tx(struct ieee80211_hw *hw, > u16 duration = info->duration ? info->duration : > jiffies_to_msecs(HZ); > > + cancel_work_sync(&mvif->phy->roc_work); > + > mt792x_mutex_acquire(dev); > mt7925_set_roc(mvif->phy, &mvif->bss_conf, > mvif->bss_conf.mt76.ctx->def.chan, duration, > @@ -2033,6 +2181,7 @@ mt7925_change_vif_links(struct ieee80211_hw *hw, struct ieee80211_vif *vif, > if (old_links == new_links) > return 0; > > + cancel_work_sync(&phy->roc_work); > mt792x_mutex_acquire(dev); > > for_each_set_bit(link_id, &rem, IEEE80211_MLD_MAX_NUM_LINKS) { > diff --git a/drivers/net/wireless/mediatek/mt76/mt792x.h b/drivers/net/wireless/mediatek/mt76/mt792x.h > index 8388638ed550..d9c1ea709390 100644 > --- a/drivers/net/wireless/mediatek/mt76/mt792x.h > +++ b/drivers/net/wireless/mediatek/mt76/mt792x.h > @@ -186,6 +186,13 @@ struct mt792x_phy { > wait_queue_head_t roc_wait; > u8 roc_token_id; > bool roc_grant; > + > + /* ROC rate limiting to prevent MCU overload during rapid reconnection > + * cycles (e.g., MLO authentication failures causing repeated ROC). > + */ > + u8 roc_timeout_count; /* consecutive ROC timeouts */ > + unsigned long roc_last_timeout; /* jiffies of last timeout */ > + unsigned long roc_backoff_until;/* don't issue ROC until this time */ > }; > > struct mt792x_irq_map { > -- > 2.52.0 > ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: [PATCH 11/11] wifi: mt76: mt7925: fix ROC deadlocks and race conditions 2026-01-20 8:25 ` Sean Wang @ 2026-01-20 17:59 ` Zac Bowling 2026-01-20 20:10 ` [PATCH v6 00/13] wifi: mt76: stability fixes for deadlocks, NULL derefs, " Zac 1 sibling, 0 replies; 113+ messages in thread
From: Zac Bowling @ 2026-01-20 17:59 UTC (permalink / raw)
To: Sean Wang
Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless, lorenzo, nbd, ryder.lee, sean.wang, stable, linux

Hi Sean,

Thank you for the detailed feedback and for sharing your deadlock fix.

> 1. Would it be possible to rebase your patchset on top of this fix

Yes, I'll rebase on your patch. I reviewed it, and it's a cleaner solution than what I implemented. My approach used an async abort with a state flag, but your `cancel_work()` approach avoids the blocking entirely.

Additionally, last night, after someone ran an AI bot check on my patches, I found two places where my current patchset introduces deadlocks that your existing patch avoids:

1. In my patch #3 I added mt792x_mutex_acquire() around ieee80211_iterate_active_interfaces(), but this function is called from mt7921_mac_sta_add/remove via mt76_sta_add/remove, which already hold dev->mutex. I need to remove this mutex wrapper.

2. In my patch #6 I wrapped mt7925_roc_abort_sync() with a mutex in the suspend path, but roc_abort_sync calls cancel_work_sync(), which can deadlock if roc_work is waiting for the mutex. Your fix addresses this more elegantly.

I'll prepare a v6 rebased on your patch with these fixes.

> 2. Could you please elaborate on the test scenarios that would trigger
> ROC rate limiting for MLO authentication failures?

The rate limiting addresses a real-world scenario we observed with MT7925 when connecting to WiFi 7 APs with MLO + Fast Transition (802.11r) enabled.

When wpa_supplicant attempts Fast Transition roaming between MLO-capable APs, there's a race condition between disconnect and key setup.
The kernel's nl80211 validation requires link_id for MLO group keys (net/wireless/nl80211.c:4828), but during FT roaming, wdev->valid_links may still be set from the previous connection when the new key setup begins. This causes repeated failures:

```
wpa_supplicant: nl80211: kernel reports: link ID must for MLO group key
wpa_supplicant: FT: Failed to set PTK to the driver
```

Each failure triggers a reconnection attempt, which requires ROC commands for scanning. When these failures happen in rapid succession (we observed 3-4 failures within seconds), the MCU seems to become overwhelmed, with messages like this:

```
Message 00020027 (MCU_UNI_CMD_ROC) timeout
Message 00020027 (MCU_UNI_CMD_ROC) timeout
Message 00020027 (MCU_UNI_CMD_ROC) timeout
```

This leads to a firmware reset, which triggers more reconnection attempts, creating a cascading failure loop.

For me at least, it reproduces with:
- Single MT7925 interface in STA mode
- WiFi 7 AP with MLO enabled (multi-link across 5 GHz + 6 GHz)
- 802.11r (Fast Transition) enabled
- Multiple APs with the same SSID (roaming scenario)

I haven't tested with multiple virtual interfaces, but the core issue is the rapid ROC request rate during the reconnection loop, not the number of interfaces.

Someone on the Framework forum posted similar dumps showing similar behavior with their Eero mesh setup. I'm using some UniFi U7 Pros with MLO enabled on one of the SSIDs.

So this might not be the right place to fix this. We may need to fix it in the upper layers. I put this here so folks could work around it with my DKMS package, but a deeper refactor across multiple layers around MLO is probably needed to really fix this. Fixing it here at least shows that things are more stable (but I can't confirm it's really fixed; I don't know what is going on inside the firmware or what internal state issues we can get into).

The root cause is likely in wpa_supplicant/mac80211 (race condition in MLO key setup timing during FT roaming).
However, the rate limiting provides a defensive measure to prevent firmware crashes. After that I can maybe investigate the upper-layer issues. That is a much bigger change, though, unfortunately.

The fix is similar to how TCP implements backoff to handle network congestion - the congestion isn't TCP's fault, but the backoff prevents cascading failures.

The detailed crash analysis and dmesg logs are in our repository:
https://github.com/zbowling/mt7925/tree/main/crashes

Specifically:
- crash-2026-01-19-mlo-authentication-failure.log (MLO key race analysis)
- crash-2026-01-12-2210-auth-loop-mcu-timeout.log (MCU timeout during auth loop)

If you believe the rate limiting is unnecessary given how ROC operations are serialized in the firmware, I can remove it. My goal was to prevent the firmware from entering a reset loop, but if there's a better approach, or if the underlying mac80211/wpa_supplicant issue should be fixed now instead, I'm happy to adjust. This just seemed to reduce the issue for my MLO setup.

Thank you for offering to prepare an out-of-tree branch - that would be very helpful for testing the integrated patchset.

Zac Bowling

On Tue, Jan 20, 2026 at 12:25 AM Sean Wang <sean.wang@kernel.org> wrote:
> > On Tue, Jan 20, 2026 at 12:29 AM Zac <zac@zacbowling.com> wrote: > > > > From: Zac Bowling <zac@zacbowling.com> > > > > Fix multiple interrelated issues in the remain-on-channel (ROC) handling > > that cause deadlocks, race conditions, and resource leaks. > > > > Problems fixed: > > > > 1. Deadlock in sta removal ROC abort path: > > When a station is removed while a ROC operation is in progress, the > > driver would call mt7925_roc_abort_sync() which waits for ROC completion. > > However, the ROC work itself needs to acquire mt792x_mutex which is > > already held during station removal, causing a deadlock.
> > > > Fix: Use async ROC abort (mt76_connac_mcu_abort_roc) when called from > > paths that already hold the mutex, and add MT76_STATE_ROC_ABORT flag > > to coordinate between the abort and the ROC timer. > > > > Hi Zac, > > Thanks for your continued efforts on the driver. > We’ve sent a patch to address the mt7925 deadlock at the link below: > https://lists.infradead.org/pipermail/linux-mediatek/2025-December/102164.html > We plan to send the same fix to mt7921 as well. > > I had a couple of questions and suggestions: > 1. Would it be possible to rebase your patchset on top of this fix > (and any other pending patches that are not yet merged)? We noticed > some conflicts when applying the series, and rebasing it this way > would make it easier for nbd to integrate the full patchset. > 2. Could you please elaborate on the test scenarios that would trigger > ROC rate limiting for MLO authentication failures? If I recall > correctly, ROC operations are typically handled sequentially unless > multiple interfaces are created on the same physical device. In that > case, how many virtual interfaces and which operating modes (GC/STA or > multiple STAs) are required to reproduce the issue? > > I will try to prepare an out-of-tree branch with the current pending > patches to help your patchset integrate more smoothly. Thanks for > collecting community issues and fixes and incorporating them into the > driver. > > Sean > > > 2. ROC timer race during suspend: > > The ROC timer could fire after the device started suspending but before > > the ROC was properly aborted, causing undefined behavior. > > > > Fix: Delete ROC timer synchronously before suspend and check device > > state before processing ROC timeout. > > > > 3. ROC rate limiting for MLO auth failures: > > Rapid ROC requests during MLO authentication can overwhelm the firmware, > > causing authentication timeouts. The MT7925 firmware has limited ROC > > handling capacity. 
> > > > Fix: Add rate limiting infrastructure with configurable minimum interval > > between ROC requests. Track last ROC completion time and defer new > > requests if they arrive too quickly. > > > > 4. WCID leak in ROC cleanup: > > When ROC operations are aborted, the associated WCID resources were > > not being properly released, causing resource exhaustion over time. > > > > Fix: Ensure WCID cleanup happens in all ROC termination paths. > > > > 5. Async ROC abort race condition: > > The async ROC abort could race with normal ROC completion, causing > > double-free or use-after-free of ROC resources. > > > > Fix: Use MT76_STATE_ROC_ABORT flag and proper synchronization to > > prevent races between async abort and normal completion paths. > > > > These fixes work together to provide robust ROC handling that doesn't > > deadlock, properly releases resources, and handles edge cases during > > suspend and MLO operations. > > > > Fixes: c948b5da6bbe ("wifi: mt76: mt7925: add Mediatek Wi-Fi7 driver for mt7925 device") > > Signed-off-by: Zac Bowling <zac@zacbowling.com> > > --- > > drivers/net/wireless/mediatek/mt76/mt76.h | 1 + > > .../net/wireless/mediatek/mt76/mt7925/main.c | 175 ++++++++++++++++-- > > drivers/net/wireless/mediatek/mt76/mt792x.h | 7 + > > 3 files changed, 170 insertions(+), 13 deletions(-) > > > > diff --git a/drivers/net/wireless/mediatek/mt76/mt76.h b/drivers/net/wireless/mediatek/mt76/mt76.h > > index d05e83ea1cac..91f9dd95c89e 100644 > > --- a/drivers/net/wireless/mediatek/mt76/mt76.h > > +++ b/drivers/net/wireless/mediatek/mt76/mt76.h > > @@ -511,6 +511,7 @@ enum { > > MT76_STATE_POWER_OFF, > > MT76_STATE_SUSPEND, > > MT76_STATE_ROC, > > + MT76_STATE_ROC_ABORT, > > MT76_STATE_PM, > > MT76_STATE_WED_RESET, > > }; > > diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/main.c b/drivers/net/wireless/mediatek/mt76/mt7925/main.c > > index cc7ef2c17032..2404f7812897 100644 > > --- a/drivers/net/wireless/mediatek/mt76/mt7925/main.c > > +++ 
b/drivers/net/wireless/mediatek/mt76/mt7925/main.c > > @@ -453,6 +453,24 @@ static void mt7925_roc_iter(void *priv, u8 *mac, > > mt7925_mcu_abort_roc(phy, &mvif->bss_conf, phy->roc_token_id); > > } > > > > +/* Async ROC abort - safe to call while holding mutex. > > + * Sets abort flag and lets roc_work handle cleanup without blocking. > > + * This prevents deadlock when called from sta_remove path which holds mutex. > > + */ > > +static void mt7925_roc_abort_async(struct mt792x_dev *dev) > > +{ > > + struct mt792x_phy *phy = &dev->phy; > > + > > + /* Set abort flag - roc_work checks this before acquiring mutex */ > > + set_bit(MT76_STATE_ROC_ABORT, &phy->mt76->state); > > + > > + /* Stop timer and schedule work to handle cleanup. > > + * Must schedule work since timer may not have fired yet. > > + */ > > + timer_delete(&phy->roc_timer); > > + ieee80211_queue_work(phy->mt76->hw, &phy->roc_work); > > +} > > + > > void mt7925_roc_abort_sync(struct mt792x_dev *dev) > > { > > struct mt792x_phy *phy = &dev->phy; > > @@ -473,6 +491,17 @@ void mt7925_roc_work(struct work_struct *work) > > phy = (struct mt792x_phy *)container_of(work, struct mt792x_phy, > > roc_work); > > > > + /* Check abort flag BEFORE acquiring mutex to prevent deadlock. > > + * If abort is requested while we're in the sta_remove path (which > > + * holds the mutex), we must not try to acquire it or we'll deadlock. > > + * Clear the flags and only notify mac80211 if ROC was actually active. 
> > + */ > > + if (test_and_clear_bit(MT76_STATE_ROC_ABORT, &phy->mt76->state)) { > > + if (test_and_clear_bit(MT76_STATE_ROC, &phy->mt76->state)) > > + ieee80211_remain_on_channel_expired(phy->mt76->hw); > > + return; > > + } > > + > > if (!test_and_clear_bit(MT76_STATE_ROC, &phy->mt76->state)) > > return; > > > > @@ -500,14 +529,93 @@ static int mt7925_abort_roc(struct mt792x_phy *phy, > > return err; > > } > > > > +/* ROC rate limiting constants - exponential backoff to prevent MCU overload > > + * when upper layers trigger rapid reconnection cycles (e.g., MLO auth failures). > > + * Max backoff ~1.6s, resets after 10s of no timeouts. > > + */ > > +#define MT7925_ROC_BACKOFF_BASE_MS 100 > > +#define MT7925_ROC_BACKOFF_MAX_MS 1600 > > +#define MT7925_ROC_TIMEOUT_RESET_MS 10000 > > +#define MT7925_ROC_TIMEOUT_WARN_THRESH 5 > > + > > +/* Check if ROC should be throttled due to recent timeouts. > > + * Returns delay in jiffies if throttling, 0 if OK to proceed. > > + */ > > +static unsigned long mt7925_roc_throttle_check(struct mt792x_phy *phy) > > +{ > > + unsigned long now = jiffies; > > + > > + /* Reset timeout counter if it's been a while since last timeout */ > > + if (phy->roc_timeout_count && > > + time_after(now, phy->roc_last_timeout + > > + msecs_to_jiffies(MT7925_ROC_TIMEOUT_RESET_MS))) { > > + phy->roc_timeout_count = 0; > > + phy->roc_backoff_until = 0; > > + } > > + > > + /* Check if we're still in backoff period */ > > + if (phy->roc_backoff_until && time_before(now, phy->roc_backoff_until)) > > + return phy->roc_backoff_until - now; > > + > > + return 0; > > +} > > + > > +/* Record ROC timeout and calculate backoff period */ > > +static void mt7925_roc_record_timeout(struct mt792x_phy *phy) > > +{ > > + unsigned int backoff_ms; > > + > > + phy->roc_last_timeout = jiffies; > > + phy->roc_timeout_count++; > > + > > + /* Exponential backoff: 100ms, 200ms, 400ms, 800ms, 1600ms (capped) */ > > + backoff_ms = MT7925_ROC_BACKOFF_BASE_MS << > > + min_t(u8, 
phy->roc_timeout_count - 1, 4); > > + if (backoff_ms > MT7925_ROC_BACKOFF_MAX_MS) > > + backoff_ms = MT7925_ROC_BACKOFF_MAX_MS; > > + > > + phy->roc_backoff_until = jiffies + msecs_to_jiffies(backoff_ms); > > + > > + /* Warn if we're seeing repeated timeouts - likely upper layer issue */ > > + if (phy->roc_timeout_count == MT7925_ROC_TIMEOUT_WARN_THRESH) > > + dev_warn(phy->dev->mt76.dev, > > + "mt7925: %u consecutive ROC timeouts, possible mac80211/wpa_supplicant issue (MLO key race?)\n", > > + phy->roc_timeout_count); > > +} > > + > > +/* Clear timeout tracking on successful ROC */ > > +static void mt7925_roc_clear_timeout(struct mt792x_phy *phy) > > +{ > > + phy->roc_timeout_count = 0; > > + phy->roc_backoff_until = 0; > > +} > > + > > static int mt7925_set_roc(struct mt792x_phy *phy, > > struct mt792x_bss_conf *mconf, > > struct ieee80211_channel *chan, > > int duration, > > enum mt7925_roc_req type) > > { > > + unsigned long throttle; > > int err; > > > > + /* Check rate limiting - if in backoff period, wait or return busy */ > > + throttle = mt7925_roc_throttle_check(phy); > > + if (throttle) { > > + /* For short backoffs, wait; for longer ones, return busy */ > > + if (throttle < msecs_to_jiffies(200)) { > > + msleep(jiffies_to_msecs(throttle)); > > + } else { > > + dev_dbg(phy->dev->mt76.dev, > > + "mt7925: ROC throttled, %lu ms remaining\n", > > + jiffies_to_msecs(throttle)); > > + return -EBUSY; > > + } > > + } > > + > > + /* Clear stale abort flag from previous ROC */ > > + clear_bit(MT76_STATE_ROC_ABORT, &phy->mt76->state); > > + > > if (test_and_set_bit(MT76_STATE_ROC, &phy->mt76->state)) > > return -EBUSY; > > > > @@ -523,7 +631,11 @@ static int mt7925_set_roc(struct mt792x_phy *phy, > > if (!wait_event_timeout(phy->roc_wait, phy->roc_grant, 4 * HZ)) { > > mt7925_mcu_abort_roc(phy, mconf, phy->roc_token_id); > > clear_bit(MT76_STATE_ROC, &phy->mt76->state); > > + mt7925_roc_record_timeout(phy); > > err = -ETIMEDOUT; > > + } else { > > + /* Successful 
ROC - reset timeout tracking */ > > + mt7925_roc_clear_timeout(phy); > > } > > > > out: > > @@ -534,8 +646,27 @@ static int mt7925_set_mlo_roc(struct mt792x_phy *phy, > > struct mt792x_bss_conf *mconf, > > u16 sel_links) > > { > > + unsigned long throttle; > > int err; > > > > + /* Check rate limiting - MLO ROC is especially prone to rapid-fire > > + * during reconnection cycles after MLO authentication failures. > > + */ > > + throttle = mt7925_roc_throttle_check(phy); > > + if (throttle) { > > + if (throttle < msecs_to_jiffies(200)) { > > + msleep(jiffies_to_msecs(throttle)); > > + } else { > > + dev_dbg(phy->dev->mt76.dev, > > + "mt7925: MLO ROC throttled, %lu ms remaining\n", > > + jiffies_to_msecs(throttle)); > > + return -EBUSY; > > + } > > + } > > + > > + /* Clear stale abort flag from previous ROC */ > > + clear_bit(MT76_STATE_ROC_ABORT, &phy->mt76->state); > > + > > if (WARN_ON_ONCE(test_and_set_bit(MT76_STATE_ROC, &phy->mt76->state))) > > return -EBUSY; > > > > @@ -550,7 +681,10 @@ static int mt7925_set_mlo_roc(struct mt792x_phy *phy, > > if (!wait_event_timeout(phy->roc_wait, phy->roc_grant, 4 * HZ)) { > > mt7925_mcu_abort_roc(phy, mconf, phy->roc_token_id); > > clear_bit(MT76_STATE_ROC, &phy->mt76->state); > > + mt7925_roc_record_timeout(phy); > > err = -ETIMEDOUT; > > + } else { > > + mt7925_roc_clear_timeout(phy); > > } > > > > out: > > @@ -567,6 +701,7 @@ static int mt7925_remain_on_channel(struct ieee80211_hw *hw, > > struct mt792x_phy *phy = mt792x_hw_phy(hw); > > int err; > > > > + cancel_work_sync(&phy->roc_work); > > mt792x_mutex_acquire(phy->dev); > > err = mt7925_set_roc(phy, &mvif->bss_conf, > > chan, duration, MT7925_ROC_REQ_ROC); > > @@ -874,14 +1009,14 @@ static int mt7925_mac_link_sta_add(struct mt76_dev *mdev, > > if (!mlink) > > return -EINVAL; > > > > - idx = mt76_wcid_alloc(dev->mt76.wcid_mask, MT792x_WTBL_STA - 1); > > - if (idx < 0) > > - return -ENOSPC; > > - > > mconf = mt792x_vif_to_link(mvif, link_id); > > if (!mconf) > > return 
-EINVAL; > > > > + idx = mt76_wcid_alloc(dev->mt76.wcid_mask, MT792x_WTBL_STA - 1); > > + if (idx < 0) > > + return -ENOSPC; > > + > > mt76_wcid_init(&mlink->wcid, 0); > > mlink->wcid.sta = 1; > > mlink->wcid.idx = idx; > > @@ -901,14 +1036,16 @@ static int mt7925_mac_link_sta_add(struct mt76_dev *mdev, > > > > ret = mt76_connac_pm_wake(&dev->mphy, &dev->pm); > > if (ret) > > - return ret; > > + goto err_wcid; > > > > mt7925_mac_wtbl_update(dev, idx, > > MT_WTBL_UPDATE_ADM_COUNT_CLEAR); > > > > link_conf = mt792x_vif_to_bss_conf(vif, link_id); > > - if (!link_conf) > > - return -EINVAL; > > + if (!link_conf) { > > + ret = -EINVAL; > > + goto err_wcid; > > + } > > > > /* should update bss info before STA add */ > > if (vif->type == NL80211_IFTYPE_STATION && !link_sta->sta->tdls) { > > @@ -920,7 +1057,7 @@ static int mt7925_mac_link_sta_add(struct mt76_dev *mdev, > > ret = mt7925_mcu_add_bss_info(&dev->phy, mconf->mt76.ctx, > > link_conf, link_sta, false); > > if (ret) > > - return ret; > > + goto err_wcid; > > } > > > > if (ieee80211_vif_is_mld(vif) && > > @@ -928,28 +1065,34 @@ static int mt7925_mac_link_sta_add(struct mt76_dev *mdev, > > ret = mt7925_mcu_sta_update(dev, link_sta, vif, true, > > MT76_STA_INFO_STATE_NONE); > > if (ret) > > - return ret; > > + goto err_wcid; > > } else if (ieee80211_vif_is_mld(vif) && > > link_sta != mlink->pri_link) { > > ret = mt7925_mcu_sta_update(dev, mlink->pri_link, vif, > > true, MT76_STA_INFO_STATE_ASSOC); > > if (ret) > > - return ret; > > + goto err_wcid; > > > > ret = mt7925_mcu_sta_update(dev, link_sta, vif, true, > > MT76_STA_INFO_STATE_ASSOC); > > if (ret) > > - return ret; > > + goto err_wcid; > > } else { > > ret = mt7925_mcu_sta_update(dev, link_sta, vif, true, > > MT76_STA_INFO_STATE_NONE); > > if (ret) > > - return ret; > > + goto err_wcid; > > } > > > > mt76_connac_power_save_sched(&dev->mphy, &dev->pm); > > > > return 0; > > + > > +err_wcid: > > + rcu_assign_pointer(dev->mt76.wcid[idx], NULL); > > + 
mt76_wcid_mask_clear(dev->mt76.wcid_mask, idx); > > + mt76_connac_power_save_sched(&dev->mphy, &dev->pm); > > + return ret; > > } > > > > static int > > @@ -1135,7 +1278,8 @@ static void mt7925_mac_link_sta_remove(struct mt76_dev *mdev, > > if (!mlink) > > return; > > > > - mt7925_roc_abort_sync(dev); > > + /* Async abort - caller already holds mutex */ > > + mt7925_roc_abort_async(dev); > > > > mt76_connac_free_pending_tx_skbs(&dev->pm, &mlink->wcid); > > mt76_connac_pm_wake(&dev->mphy, &dev->pm); > > @@ -1530,6 +1674,8 @@ static int mt7925_suspend(struct ieee80211_hw *hw, > > cancel_delayed_work_sync(&dev->pm.ps_work); > > mt76_connac_free_pending_tx_skbs(&dev->pm, NULL); > > > > + /* Cancel ROC before quiescing starts */ > > + mt7925_roc_abort_sync(dev); > > mt792x_mutex_acquire(dev); > > > > clear_bit(MT76_STATE_RUNNING, &phy->mt76->state); > > @@ -1876,6 +2022,8 @@ static void mt7925_mgd_prepare_tx(struct ieee80211_hw *hw, > > u16 duration = info->duration ? info->duration : > > jiffies_to_msecs(HZ); > > > > + cancel_work_sync(&mvif->phy->roc_work); > > + > > mt792x_mutex_acquire(dev); > > mt7925_set_roc(mvif->phy, &mvif->bss_conf, > > mvif->bss_conf.mt76.ctx->def.chan, duration, > > @@ -2033,6 +2181,7 @@ mt7925_change_vif_links(struct ieee80211_hw *hw, struct ieee80211_vif *vif, > > if (old_links == new_links) > > return 0; > > > > + cancel_work_sync(&phy->roc_work); > > mt792x_mutex_acquire(dev); > > > > for_each_set_bit(link_id, &rem, IEEE80211_MLD_MAX_NUM_LINKS) { > > diff --git a/drivers/net/wireless/mediatek/mt76/mt792x.h b/drivers/net/wireless/mediatek/mt76/mt792x.h > > index 8388638ed550..d9c1ea709390 100644 > > --- a/drivers/net/wireless/mediatek/mt76/mt792x.h > > +++ b/drivers/net/wireless/mediatek/mt76/mt792x.h > > @@ -186,6 +186,13 @@ struct mt792x_phy { > > wait_queue_head_t roc_wait; > > u8 roc_token_id; > > bool roc_grant; > > + > > + /* ROC rate limiting to prevent MCU overload during rapid reconnection > > + * cycles (e.g., MLO authentication 
failures causing repeated ROC). > > + */ > > + u8 roc_timeout_count; /* consecutive ROC timeouts */ > > + unsigned long roc_last_timeout; /* jiffies of last timeout */ > > + unsigned long roc_backoff_until;/* don't issue ROC until this time */ > > }; > > > > struct mt792x_irq_map { > > -- > > 2.52.0 > > ^ permalink raw reply [flat|nested] 113+ messages in thread
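Editorial note: the backoff arithmetic in the quoted mt7925_roc_record_timeout() can be checked in isolation. The following is a minimal userspace sketch, not the driver code. The two constants are copied from the quoted patch; the helper name roc_backoff_ms() is hypothetical and exists only for illustration.

```c
#include <assert.h>

/* Constants copied from the quoted patch. */
#define MT7925_ROC_BACKOFF_BASE_MS 100
#define MT7925_ROC_BACKOFF_MAX_MS  1600

/*
 * Hypothetical helper (not in the patch): backoff in milliseconds after
 * the Nth consecutive ROC timeout, N >= 1. Mirrors the quoted logic:
 * shift the base by (N - 1), limit the shift to 4, then cap the result.
 */
static unsigned int roc_backoff_ms(unsigned int timeout_count)
{
	unsigned int shift = timeout_count - 1;
	unsigned int backoff_ms;

	if (shift > 4)
		shift = 4;

	backoff_ms = MT7925_ROC_BACKOFF_BASE_MS << shift;
	if (backoff_ms > MT7925_ROC_BACKOFF_MAX_MS)
		backoff_ms = MT7925_ROC_BACKOFF_MAX_MS;

	return backoff_ms;
}
```

With these constants the shift limit already keeps the value at or below 1600 ms, so the explicit cap only matters if the base or the shift limit is changed later.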
* [PATCH v6 00/13] wifi: mt76: stability fixes for deadlocks, NULL derefs, and race conditions
  2026-01-20  8:25 ` Sean Wang
  2026-01-20 17:59 ` Zac Bowling
@ 2026-01-20 20:10 ` Zac
  2026-01-20 20:10 ` [PATCH 01/13] wifi: mt76: mt7925: fix potential deadlock in mt7925_roc_abort_sync Zac
  ` (12 more replies)
  1 sibling, 13 replies; 113+ messages in thread
From: Zac @ 2026-01-20 20:10 UTC (permalink / raw)
  To: sean.wang
  Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless, linux, lorenzo, nbd, ryder.lee, sean.wang, zac, zbowling

From: Zac Bowling <zac@zacbowling.com>

TLDR: This series addresses stability issues in both the MT7921 and MT7925
WiFi drivers that cause kernel panics, deadlocks, and system hangs on
various systems using these drivers.

This v6 series is rebased on Sean Wang's upstream deadlock fix, which has
already been sent and is now included as patch 01/13. The remaining 12
patches are my stability fixes.

Changes since v5:
- Rebased on Sean Wang's fix for the mt7925_roc_abort_sync deadlock (now
  patch 1) and removed my workaround for the same issue, as Sean's fix is
  better.
- Fixed format string warning in patch 12: %lu -> %u for jiffies_to_msecs()
  return type (caught by kernel test robot)
- Added patch 13: fix double wcid initialization race condition - removes
  duplicate mt76_wcid_init() call that occurred after rcu_assign_pointer(),
  which could cause list corruption, memory leaks, and race conditions
  (this is a pre-existing bug in upstream, not introduced by this series)

Zac Bowling (12):
  wifi: mt76: fix list corruption in mt76_wcid_cleanup
  wifi: mt76: mt792x: fix NULL pointer and firmware reload issues
  wifi: mt76: mt7921: add mutex protection in critical paths
  wifi: mt76: mt7921: fix deadlock in sta removal and suspend ROC abort
  wifi: mt76: mt7925: add comprehensive NULL pointer protection for MLO
  wifi: mt76: mt7925: add mutex protection in critical paths
  wifi: mt76: mt7925: add MCU command error handling
  wifi: mt76: mt7925: add lockdep assertions for mutex verification
  wifi: mt76: mt7925: fix MLO roaming and ROC setup issues
  wifi: mt76: mt7925: fix BA session teardown during beacon loss
  wifi: mt76: mt7925: fix ROC deadlocks and race conditions
  wifi: mt76: mt7925: fix double wcid initialization race condition

Sean Wang (1):
  wifi: mt76: mt7925: fix potential deadlock in mt7925_roc_abort_sync

 drivers/net/wireless/mediatek/mt76/mac80211.c | 10 +
 drivers/net/wireless/mediatek/mt76/mt76.h | 1 +
 drivers/net/wireless/mediatek/mt76/mt7921/mac.c | 2 +
 drivers/net/wireless/mediatek/mt76/mt7921/main.c | 28 ++-
 drivers/net/wireless/mediatek/mt76/mt7925/mac.c | 8 +
 drivers/net/wireless/mediatek/mt76/mt7925/main.c | 303 ++++++++++++++++++++---
 drivers/net/wireless/mediatek/mt76/mt7925/mcu.c | 48 +++-
 drivers/net/wireless/mediatek/mt76/mt7925/pci.c | 2 +
 drivers/net/wireless/mediatek/mt76/mt792x.h | 7 +
 drivers/net/wireless/mediatek/mt76/mt792x_core.c | 27 +-
 10 files changed, 390 insertions(+), 46 deletions(-)

--
2.52.0

^ permalink raw reply [flat|nested] 113+ messages in thread
* [PATCH 01/13] wifi: mt76: mt7925: fix potential deadlock in mt7925_roc_abort_sync 2026-01-20 20:10 ` [PATCH v6 00/13] wifi: mt76: stability fixes for deadlocks, NULL derefs, " Zac @ 2026-01-20 20:10 ` Zac 2026-01-20 20:10 ` [PATCH 02/13] wifi: mt76: fix list corruption in mt76_wcid_cleanup Zac ` (11 subsequent siblings) 12 siblings, 0 replies; 113+ messages in thread From: Zac @ 2026-01-20 20:10 UTC (permalink / raw) To: sean.wang Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless, linux, lorenzo, nbd, ryder.lee, sean.wang, zac, zbowling, Quan Zhou From: Sean Wang <sean.wang@mediatek.com> roc_abort_sync() can deadlock with roc_work(). roc_work() holds dev->mt76.mutex, while cancel_work_sync() waits for roc_work() to finish. If the caller already owns the same mutex, both sides block and no progress is possible. This deadlock can occur during station removal when mt76_sta_state() -> mt76_sta_remove() -> mt7925_mac_sta_remove_link() -> mt7925_mac_link_sta_remove() -> mt7925_roc_abort_sync() invokes cancel_work_sync() while roc_work() is still running and holding dev->mt76.mutex. This avoids the mutex deadlock and preserves exactly-once work ownership. 
Fixes: 45064d19fd3a ("wifi: mt76: mt7925: fix a potential association failure upon resuming") Co-developed-by: Quan Zhou <quan.zhou@mediatek.com> Signed-off-by: Quan Zhou <quan.zhou@mediatek.com> Signed-off-by: Sean Wang <sean.wang@mediatek.com> --- drivers/net/wireless/mediatek/mt76/mt7925/main.c | 14 +++++++++----- 1 file changed, 9 insertions(+), 5 deletions(-) diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/main.c b/drivers/net/wireless/mediatek/mt76/mt7925/main.c index 2d358a96640c..05990455ee7d 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7925/main.c +++ b/drivers/net/wireless/mediatek/mt76/mt7925/main.c @@ -457,12 +457,16 @@ void mt7925_roc_abort_sync(struct mt792x_dev *dev) { struct mt792x_phy *phy = &dev->phy; + if (!test_and_clear_bit(MT76_STATE_ROC, &phy->mt76->state)) + return; + timer_delete_sync(&phy->roc_timer); - cancel_work_sync(&phy->roc_work); - if (test_and_clear_bit(MT76_STATE_ROC, &phy->mt76->state)) - ieee80211_iterate_interfaces(mt76_hw(dev), - IEEE80211_IFACE_ITER_RESUME_ALL, - mt7925_roc_iter, (void *)phy); + + cancel_work(&phy->roc_work); + + ieee80211_iterate_interfaces(mt76_hw(dev), + IEEE80211_IFACE_ITER_RESUME_ALL, + mt7925_roc_iter, (void *)phy); } EXPORT_SYMBOL_GPL(mt7925_roc_abort_sync); -- 2.52.0 ^ permalink raw reply related [flat|nested] 113+ messages in thread
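Editorial note: the patch above moves test_and_clear_bit() to the top of mt7925_roc_abort_sync(), so only the caller that actually clears MT76_STATE_ROC goes on to do the cleanup. That is the "exactly-once ownership" the commit message mentions. A minimal userspace sketch of the idea follows, using C11 atomics as a stand-in for the kernel bit operations. All names here (STATE_ROC_BIT, test_and_clear_bit_sketch, abort_path_sketch, cleanups_run) are hypothetical and not part of the driver.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical bit index standing in for MT76_STATE_ROC. */
#define STATE_ROC_BIT 0

/*
 * Userspace stand-in for the kernel's test_and_clear_bit(): atomically
 * clear the bit and report whether it was set before the call.
 */
static bool test_and_clear_bit_sketch(unsigned int nr, atomic_ulong *state)
{
	unsigned long mask = 1UL << nr;

	return (atomic_fetch_and(state, ~mask) & mask) != 0;
}

/* Counts how many callers actually performed the cleanup. */
static int cleanups_run;

/*
 * Sketch of an abort path: whoever clears the bit owns the cleanup,
 * every other caller returns immediately without blocking.
 */
static void abort_path_sketch(atomic_ulong *state)
{
	if (!test_and_clear_bit_sketch(STATE_ROC_BIT, state))
		return;

	cleanups_run++;
}
```

Calling abort_path_sketch() twice on a state word with the bit set runs the cleanup once; the second call sees the bit already cleared and returns.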
* [PATCH 02/13] wifi: mt76: fix list corruption in mt76_wcid_cleanup 2026-01-20 20:10 ` [PATCH v6 00/13] wifi: mt76: stability fixes for deadlocks, NULL derefs, " Zac 2026-01-20 20:10 ` [PATCH 01/13] wifi: mt76: mt7925: fix potential deadlock in mt7925_roc_abort_sync Zac @ 2026-01-20 20:10 ` Zac 2026-01-20 20:10 ` [PATCH 03/13] wifi: mt76: mt792x: fix NULL pointer and firmware reload issues Zac ` (10 subsequent siblings) 12 siblings, 0 replies; 113+ messages in thread From: Zac @ 2026-01-20 20:10 UTC (permalink / raw) To: sean.wang Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless, linux, lorenzo, nbd, ryder.lee, sean.wang, zac, zbowling From: Zac Bowling <zac@zacbowling.com> mt76_wcid_cleanup() was not removing wcid entries from sta_poll_list before mt76_reset_device() reinitializes the master list. This leaves stale pointers in wcid->poll_list, causing list corruption when mt76_wcid_add_poll() later checks list_empty() and tries to add the entry back. The fix adds proper cleanup of poll_list in mt76_wcid_cleanup(), matching how tx_list is already handled. This is similar to what mt7996_mac_sta_deinit_link() already does correctly. Fixes list corruption warnings like: list_add corruption. prev->next should be next (ffffffff...) Signed-off-by: Zac Bowling <zac@zacbowling.com> --- drivers/net/wireless/mediatek/mt76/mac80211.c | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/drivers/net/wireless/mediatek/mt76/mac80211.c b/drivers/net/wireless/mediatek/mt76/mac80211.c index 75772979f438..d0c522909e98 100644 --- a/drivers/net/wireless/mediatek/mt76/mac80211.c +++ b/drivers/net/wireless/mediatek/mt76/mac80211.c @@ -1716,6 +1716,16 @@ void mt76_wcid_cleanup(struct mt76_dev *dev, struct mt76_wcid *wcid) idr_destroy(&wcid->pktid); + /* Remove from sta_poll_list to prevent list corruption after reset. 
+ * Without this, mt76_reset_device() reinitializes sta_poll_list but + * leaves wcid->poll_list with stale pointers, causing list corruption + * when mt76_wcid_add_poll() checks list_empty(). + */ + spin_lock_bh(&dev->sta_poll_lock); + if (!list_empty(&wcid->poll_list)) + list_del_init(&wcid->poll_list); + spin_unlock_bh(&dev->sta_poll_lock); + spin_lock_bh(&phy->tx_lock); if (!list_empty(&wcid->tx_list)) -- 2.52.0 ^ permalink raw reply related [flat|nested] 113+ messages in thread
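Editorial note: the failure mode described in this patch can be reproduced with a toy circular list. The sketch below is a userspace imitation of the kernel's list primitives, written only for illustration; the _sketch functions are hypothetical names and are not the <linux/list.h> implementations.

```c
#include <assert.h>
#include <stdbool.h>

/* Minimal circular doubly linked list node, modelled on struct list_head. */
struct list_node {
	struct list_node *next, *prev;
};

/* Make a node point at itself, i.e. an empty list or an unlinked entry. */
static void list_init(struct list_node *n)
{
	n->next = n;
	n->prev = n;
}

/* An entry counts as "empty" only when it points at itself. */
static bool list_is_empty(const struct list_node *n)
{
	return n->next == n;
}

/* Link entry at the tail of the list headed by head. */
static void list_add_tail_sketch(struct list_node *entry, struct list_node *head)
{
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

/* Unlink entry and reinitialise it, so a later emptiness check is true. */
static void list_del_init_sketch(struct list_node *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
	list_init(entry);
}
```

If the head is reinitialised while an entry is still linked, the entry keeps pointing at the old neighbours, so an emptiness check on the entry reports "not empty" even though the head no longer knows about it. Unlinking with list_del_init_sketch() before the head is reset avoids that stale state, which is the ordering the patch enforces for wcid->poll_list.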
* [PATCH 03/13] wifi: mt76: mt792x: fix NULL pointer and firmware reload issues 2026-01-20 20:10 ` [PATCH v6 00/13] wifi: mt76: stability fixes for deadlocks, NULL derefs, " Zac 2026-01-20 20:10 ` [PATCH 01/13] wifi: mt76: mt7925: fix potential deadlock in mt7925_roc_abort_sync Zac 2026-01-20 20:10 ` [PATCH 02/13] wifi: mt76: fix list corruption in mt76_wcid_cleanup Zac @ 2026-01-20 20:10 ` Zac 2026-01-20 20:10 ` [PATCH 04/13] wifi: mt76: mt7921: add mutex protection in critical paths Zac ` (9 subsequent siblings) 12 siblings, 0 replies; 113+ messages in thread From: Zac @ 2026-01-20 20:10 UTC (permalink / raw) To: sean.wang Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless, linux, lorenzo, nbd, ryder.lee, sean.wang, zac, zbowling From: Zac Bowling <zac@zacbowling.com> This patch combines two fixes for the shared mt792x code used by both MT7921 and MT7925 drivers: 1. Fix NULL pointer dereference in TX path: Add NULL pointer checks in mt792x_tx() to prevent kernel crashes when transmitting packets during MLO link removal. The function calls mt792x_sta_to_link() which can return NULL if the link is being removed, but the return value was dereferenced without checking. Similarly, the RCU-protected link_conf and link_sta pointers were used without NULL validation. This race can occur when: - A packet is queued for transmission - Concurrently, the link is being removed (mt7925_mac_link_sta_remove) - mt792x_sta_to_link() returns NULL for the removed link - Kernel crashes on wcid = &mlink->wcid dereference Fix by checking mlink, conf, and link_sta before use, freeing the SKB and returning early if any pointer is NULL. 2. Fix firmware reload failure after previous load crash: If the firmware loading process crashes or is interrupted after acquiring the patch semaphore but before releasing it, subsequent firmware load attempts will fail with 'Failed to get patch semaphore'. 
Apply the same fix from MT7915 (commit 79dd14f): release the patch semaphore before starting firmware load and restart MCU firmware to ensure clean state. Fixes: c74df1c067f2 ("wifi: mt76: mt792x: introduce mt792x-lib module") Fixes: 583204ae70f9 ("wifi: mt76: mt792x: move mt7921_load_firmware in mt792x-lib module") Link: https://github.com/openwrt/mt76/commit/79dd14f2e8161b656341b6653261779199aedbe4 Signed-off-by: Zac Bowling <zac@zacbowling.com> --- .../net/wireless/mediatek/mt76/mt792x_core.c | 27 +++++++++++++++++-- 1 file changed, 25 insertions(+), 2 deletions(-) diff --git a/drivers/net/wireless/mediatek/mt76/mt792x_core.c b/drivers/net/wireless/mediatek/mt76/mt792x_core.c index f2ed16feb6c1..05598202b488 100644 --- a/drivers/net/wireless/mediatek/mt76/mt792x_core.c +++ b/drivers/net/wireless/mediatek/mt76/mt792x_core.c @@ -95,6 +95,8 @@ void mt792x_tx(struct ieee80211_hw *hw, struct ieee80211_tx_control *control, IEEE80211_TX_CTRL_MLO_LINK); sta = (struct mt792x_sta *)control->sta->drv_priv; mlink = mt792x_sta_to_link(sta, link_id); + if (!mlink) + goto free_skb; wcid = &mlink->wcid; } @@ -113,9 +115,12 @@ void mt792x_tx(struct ieee80211_hw *hw, struct ieee80211_tx_control *control, link_id = wcid->link_id; rcu_read_lock(); conf = rcu_dereference(vif->link_conf[link_id]); - memcpy(hdr->addr2, conf->addr, ETH_ALEN); - link_sta = rcu_dereference(control->sta->link[link_id]); + if (!conf || !link_sta) { + rcu_read_unlock(); + goto free_skb; + } + memcpy(hdr->addr2, conf->addr, ETH_ALEN); memcpy(hdr->addr1, link_sta->addr, ETH_ALEN); if (vif->type == NL80211_IFTYPE_STATION) @@ -136,6 +141,10 @@ void mt792x_tx(struct ieee80211_hw *hw, struct ieee80211_tx_control *control, } mt76_connac_pm_queue_skb(hw, &dev->pm, wcid, skb); + return; + +free_skb: + ieee80211_free_txskb(hw, skb); } EXPORT_SYMBOL_GPL(mt792x_tx); @@ -927,6 +936,20 @@ int mt792x_load_firmware(struct mt792x_dev *dev) { int ret; + /* Release semaphore if taken by previous failed load attempt. 
+ * This prevents "Failed to get patch semaphore" errors when + * recovering from firmware crashes or suspend/resume failures. + */ + ret = mt76_connac_mcu_patch_sem_ctrl(&dev->mt76, false); + if (ret < 0) + dev_dbg(dev->mt76.dev, "Semaphore release returned %d (may be expected)\n", ret); + + /* Always restart MCU to ensure clean state before loading firmware */ + mt76_connac_mcu_restart(&dev->mt76); + + /* Wait for MCU to be ready after restart */ + msleep(100); + ret = mt76_connac2_load_patch(&dev->mt76, mt792x_patch_name(dev)); if (ret) return ret; -- 2.52.0 ^ permalink raw reply related [flat|nested] 113+ messages in thread
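Editorial note: the TX-path change above follows the common "check, then jump to a single cleanup label" shape. A small userspace sketch of that shape is given below, assuming a heap-allocated packet in place of an skb. The types and names (link_sketch, packet_sketch, tx_sketch, freed_packets) are hypothetical and stand in for the driver structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the per-link state and the skb. */
struct link_sketch {
	int wcid_idx;
};

struct packet_sketch {
	int queued_on;
};

/* Counts packets dropped because the link had already gone away. */
static int freed_packets;

/*
 * Queue pkt on link, or free it if the link lookup returned NULL.
 * Ownership of pkt passes to this function only on the drop path,
 * matching the way the patch frees the skb and returns early.
 */
static void tx_sketch(struct packet_sketch *pkt, const struct link_sketch *link)
{
	if (!link)
		goto free_pkt;

	pkt->queued_on = link->wcid_idx;
	return;

free_pkt:
	free(pkt);
	freed_packets++;
}
```

Keeping one label for the drop path means every additional NULL check added later only needs a goto, rather than its own copy of the free call.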
* [PATCH 04/13] wifi: mt76: mt7921: add mutex protection in critical paths 2026-01-20 20:10 ` [PATCH v6 00/13] wifi: mt76: stability fixes for deadlocks, NULL derefs, " Zac ` (2 preceding siblings ...) 2026-01-20 20:10 ` [PATCH 03/13] wifi: mt76: mt792x: fix NULL pointer and firmware reload issues Zac @ 2026-01-20 20:10 ` Zac 2026-01-27 10:59 ` Felix Fietkau 2026-01-20 20:10 ` [PATCH 05/13] wifi: mt76: mt7921: fix deadlock in sta removal and suspend ROC abort Zac ` (8 subsequent siblings) 12 siblings, 1 reply; 113+ messages in thread From: Zac @ 2026-01-20 20:10 UTC (permalink / raw) To: sean.wang Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless, linux, lorenzo, nbd, ryder.lee, sean.wang, zac, zbowling From: Zac Bowling <zac@zacbowling.com> Add proper mutex protection for mt7921 driver operations that access hardware state without proper synchronization. This fixes multiple race conditions that can cause system instability. Fixes added: 1. mac.c: mt7921_mac_reset_work() - Wrap ieee80211_iterate_active_interfaces() with mt792x_mutex - The vif_connect_iter callback accesses hw_encap state 2. main.c: mt7921_remain_on_channel() - Remove mt792x_mutex_acquire/release around mt7925_set_channel_state() - The function is already called with mutex held from mac80211 - This was causing double-lock deadlock 3. main.c: mt7921_cancel_remain_on_channel() - Remove mt792x_mutex_acquire/release - Function is called from mac80211 with mutex already held 4. pci.c: mt7921_pci_pm_complete() - Remove mt792x_mutex_acquire/release around ieee80211_iterate_active_interfaces - This was causing deadlock as the vif connect iteration tries to acquire the mutex again 5. usb.c: mt7921_usb_pm_complete() - Same fix as pci.c for USB driver path These changes prevent both missing mutex protection and mutex deadlocks in the mt7921 driver. 
Fixes: 5c14a5f944b9 ("wifi: mt76: mt7921: introduce remain_on_channel support") Signed-off-by: Zac Bowling <zac@zacbowling.com> --- drivers/net/wireless/mediatek/mt76/mt7921/mac.c | 2 ++ drivers/net/wireless/mediatek/mt76/mt7921/main.c | 9 +++++++++ drivers/net/wireless/mediatek/mt76/mt7921/pci.c | 2 ++ drivers/net/wireless/mediatek/mt76/mt7921/sdio.c | 2 ++ 4 files changed, 15 insertions(+) diff --git a/drivers/net/wireless/mediatek/mt76/mt7921/mac.c b/drivers/net/wireless/mediatek/mt76/mt7921/mac.c index 03b4960db73f..f5c882e45bbe 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7921/mac.c +++ b/drivers/net/wireless/mediatek/mt76/mt7921/mac.c @@ -693,9 +693,11 @@ void mt7921_mac_reset_work(struct work_struct *work) clear_bit(MT76_RESET, &dev->mphy.state); pm->suspended = false; ieee80211_wake_queues(hw); + mt792x_mutex_acquire(dev); ieee80211_iterate_active_interfaces(hw, IEEE80211_IFACE_ITER_RESUME_ALL, mt7921_vif_connect_iter, NULL); + mt792x_mutex_release(dev); mt76_connac_power_save_sched(&dev->mt76.phy, pm); } diff --git a/drivers/net/wireless/mediatek/mt76/mt7921/main.c b/drivers/net/wireless/mediatek/mt76/mt7921/main.c index 5fae9a6e273c..196fcb1e2e94 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7921/main.c +++ b/drivers/net/wireless/mediatek/mt76/mt7921/main.c @@ -373,6 +373,11 @@ void mt7921_roc_abort_sync(struct mt792x_dev *dev) timer_delete_sync(&phy->roc_timer); cancel_work_sync(&phy->roc_work); + /* Note: caller must hold mutex if ieee80211_iterate_interfaces is + * needed for ROC cleanup. Some call sites (like mt7921_mac_sta_remove) + * already hold the mutex via mt76_sta_remove(). For suspend paths, + * the mutex should be acquired before calling this function. 
+ */ if (test_and_clear_bit(MT76_STATE_ROC, &phy->mt76->state)) ieee80211_iterate_interfaces(mt76_hw(dev), IEEE80211_IFACE_ITER_RESUME_ALL, @@ -619,6 +624,7 @@ void mt7921_set_runtime_pm(struct mt792x_dev *dev) bool monitor = !!(hw->conf.flags & IEEE80211_CONF_MONITOR); pm->enable = pm->enable_user && !monitor; + /* Note: caller (debugfs) must hold mutex before calling this function */ ieee80211_iterate_active_interfaces(hw, IEEE80211_IFACE_ITER_RESUME_ALL, mt7921_pm_interface_iter, dev); @@ -765,6 +771,9 @@ mt7921_regd_set_6ghz_power_type(struct ieee80211_vif *vif, bool is_add) struct mt792x_dev *dev = phy->dev; u32 valid_vif_num = 0; + /* Note: caller (mt7921_mac_sta_add/remove via mt76_sta_add/remove) + * already holds dev->mt76.mutex, so we must not acquire it here. + */ ieee80211_iterate_active_interfaces(mt76_hw(dev), IEEE80211_IFACE_ITER_RESUME_ALL, mt7921_calc_vif_num, &valid_vif_num); diff --git a/drivers/net/wireless/mediatek/mt76/mt7921/pci.c b/drivers/net/wireless/mediatek/mt76/mt7921/pci.c index ec9686183251..9f76b334b93d 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7921/pci.c +++ b/drivers/net/wireless/mediatek/mt76/mt7921/pci.c @@ -426,7 +426,9 @@ static int mt7921_pci_suspend(struct device *device) cancel_delayed_work_sync(&pm->ps_work); cancel_work_sync(&pm->wake_work); + mt792x_mutex_acquire(dev); mt7921_roc_abort_sync(dev); + mt792x_mutex_release(dev); err = mt792x_mcu_drv_pmctrl(dev); if (err < 0) diff --git a/drivers/net/wireless/mediatek/mt76/mt7921/sdio.c b/drivers/net/wireless/mediatek/mt76/mt7921/sdio.c index 3421e53dc948..92ea2811816f 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7921/sdio.c +++ b/drivers/net/wireless/mediatek/mt76/mt7921/sdio.c @@ -219,7 +219,9 @@ static int mt7921s_suspend(struct device *__dev) cancel_delayed_work_sync(&pm->ps_work); cancel_work_sync(&pm->wake_work); + mt792x_mutex_acquire(dev); mt7921_roc_abort_sync(dev); + mt792x_mutex_release(dev); err = mt792x_mcu_drv_pmctrl(dev); if (err < 0) -- 2.52.0 ^ 
permalink raw reply related [flat|nested] 113+ messages in thread
* Re: [PATCH 04/13] wifi: mt76: mt7921: add mutex protection in critical paths 2026-01-20 20:10 ` [PATCH 04/13] wifi: mt76: mt7921: add mutex protection in critical paths Zac @ 2026-01-27 10:59 ` Felix Fietkau 2026-01-29 6:19 ` Zac Bowling 0 siblings, 1 reply; 113+ messages in thread From: Felix Fietkau @ 2026-01-27 10:59 UTC (permalink / raw) To: Zac, sean.wang Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless, linux, lorenzo, ryder.lee, sean.wang, zbowling On 20.01.26 21:10, Zac wrote: > From: Zac Bowling <zac@zacbowling.com> > > Add proper mutex protection for mt7921 driver operations that access > hardware state without proper synchronization. This fixes multiple race > conditions that can cause system instability. > > Fixes added: > > 1. mac.c: mt7921_mac_reset_work() > - Wrap ieee80211_iterate_active_interfaces() with mt792x_mutex > - The vif_connect_iter callback accesses hw_encap state > > 2. main.c: mt7921_remain_on_channel() > - Remove mt792x_mutex_acquire/release around mt7925_set_channel_state() > - The function is already called with mutex held from mac80211 > - This was causing double-lock deadlock > > 3. main.c: mt7921_cancel_remain_on_channel() > - Remove mt792x_mutex_acquire/release > - Function is called from mac80211 with mutex already held > > 4. pci.c: mt7921_pci_pm_complete() > - Remove mt792x_mutex_acquire/release around ieee80211_iterate_active_interfaces > - This was causing deadlock as the vif connect iteration tries > to acquire the mutex again > > 5. usb.c: mt7921_usb_pm_complete() > - Same fix as pci.c for USB driver path Changelog should be below "---" after the commit description, so it doesn't get picked up. > These changes prevent both missing mutex protection and mutex deadlocks > in the mt7921 driver. 
> > Fixes: 5c14a5f944b9 ("wifi: mt76: mt7921: introduce remain_on_channel support") > Signed-off-by: Zac Bowling <zac@zacbowling.com> > diff --git a/drivers/net/wireless/mediatek/mt76/mt7921/main.c b/drivers/net/wireless/mediatek/mt76/mt7921/main.c > index 5fae9a6e273c..196fcb1e2e94 100644 > --- a/drivers/net/wireless/mediatek/mt76/mt7921/main.c > +++ b/drivers/net/wireless/mediatek/mt76/mt7921/main.c > @@ -373,6 +373,11 @@ void mt7921_roc_abort_sync(struct mt792x_dev *dev) > > timer_delete_sync(&phy->roc_timer); > cancel_work_sync(&phy->roc_work); > + /* Note: caller must hold mutex if ieee80211_iterate_interfaces is > + * needed for ROC cleanup. Some call sites (like mt7921_mac_sta_remove) > + * already hold the mutex via mt76_sta_remove(). For suspend paths, > + * the mutex should be acquired before calling this function. > + */ > if (test_and_clear_bit(MT76_STATE_ROC, &phy->mt76->state)) > ieee80211_iterate_interfaces(mt76_hw(dev), > IEEE80211_IFACE_ITER_RESUME_ALL, > @@ -619,6 +624,7 @@ void mt7921_set_runtime_pm(struct mt792x_dev *dev) > bool monitor = !!(hw->conf.flags & IEEE80211_CONF_MONITOR); > > pm->enable = pm->enable_user && !monitor; > + /* Note: caller (debugfs) must hold mutex before calling this function */ > ieee80211_iterate_active_interfaces(hw, > IEEE80211_IFACE_ITER_RESUME_ALL, > mt7921_pm_interface_iter, dev); > @@ -765,6 +771,9 @@ mt7921_regd_set_6ghz_power_type(struct ieee80211_vif *vif, bool is_add) > struct mt792x_dev *dev = phy->dev; > u32 valid_vif_num = 0; > > + /* Note: caller (mt7921_mac_sta_add/remove via mt76_sta_add/remove) > + * already holds dev->mt76.mutex, so we must not acquire it here. > + */ > ieee80211_iterate_active_interfaces(mt76_hw(dev), > IEEE80211_IFACE_ITER_RESUME_ALL, > mt7921_calc_vif_num, &valid_vif_num); It looks like these comments should be replaced with lockdep_assert_held, so that these assumptions can be verified automatically instead of doing so by hand. 
> diff --git a/drivers/net/wireless/mediatek/mt76/mt7921/pci.c b/drivers/net/wireless/mediatek/mt76/mt7921/pci.c > index ec9686183251..9f76b334b93d 100644 > --- a/drivers/net/wireless/mediatek/mt76/mt7921/pci.c > +++ b/drivers/net/wireless/mediatek/mt76/mt7921/pci.c > @@ -426,7 +426,9 @@ static int mt7921_pci_suspend(struct device *device) > cancel_delayed_work_sync(&pm->ps_work); > cancel_work_sync(&pm->wake_work); > > + mt792x_mutex_acquire(dev); > mt7921_roc_abort_sync(dev); > + mt792x_mutex_release(dev); The next patch is removing those... - Felix ^ permalink raw reply [flat|nested] 113+ messages in thread
* Re: [PATCH 04/13] wifi: mt76: mt7921: add mutex protection in critical paths 2026-01-27 10:59 ` Felix Fietkau @ 2026-01-29 6:19 ` Zac Bowling 0 siblings, 0 replies; 113+ messages in thread From: Zac Bowling @ 2026-01-29 6:19 UTC (permalink / raw) To: Felix Fietkau Cc: sean.wang, deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless, linux, lorenzo, ryder.lee, sean.wang You are right. I caught that too. After reordering, when I went from the 24-something patch version of this series earlier this month to this smaller 11-patch series, to make it easier to follow again, that happened. It's already gone in my new v7 series. We lock somewhere else up in the stack now. I'm cleaning up this whole stack again, dropping the ROC_ABORT back off hack, because I think actually the solution isn't at this layer at all, but possibly in the mac80211 layer. Zac Bowling On Tue, Jan 27, 2026 at 2:59 AM Felix Fietkau <nbd@nbd.name> wrote: > > On 20.01.26 21:10, Zac wrote: > > From: Zac Bowling <zac@zacbowling.com> > > > > Add proper mutex protection for mt7921 driver operations that access > > hardware state without proper synchronization. This fixes multiple race > > conditions that can cause system instability. > > > > Fixes added: > > > > 1. mac.c: mt7921_mac_reset_work() > > - Wrap ieee80211_iterate_active_interfaces() with mt792x_mutex > > - The vif_connect_iter callback accesses hw_encap state > > > > 2. main.c: mt7921_remain_on_channel() > > - Remove mt792x_mutex_acquire/release around mt7925_set_channel_state() > > - The function is already called with mutex held from mac80211 > > - This was causing double-lock deadlock > > > > 3. main.c: mt7921_cancel_remain_on_channel() > > - Remove mt792x_mutex_acquire/release > > - Function is called from mac80211 with mutex already held > > > > 4. 
pci.c: mt7921_pci_pm_complete() > > - Remove mt792x_mutex_acquire/release around ieee80211_iterate_active_interfaces > > - This was causing deadlock as the vif connect iteration tries > > to acquire the mutex again > > > > 5. usb.c: mt7921_usb_pm_complete() > > - Same fix as pci.c for USB driver path > Changelog should be below "---" after the commit description, so it > doesn't get picked up. > > > These changes prevent both missing mutex protection and mutex deadlocks > > in the mt7921 driver. > > > > Fixes: 5c14a5f944b9 ("wifi: mt76: mt7921: introduce remain_on_channel support") > > Signed-off-by: Zac Bowling <zac@zacbowling.com> > > > diff --git a/drivers/net/wireless/mediatek/mt76/mt7921/main.c b/drivers/net/wireless/mediatek/mt76/mt7921/main.c > > index 5fae9a6e273c..196fcb1e2e94 100644 > > --- a/drivers/net/wireless/mediatek/mt76/mt7921/main.c > > +++ b/drivers/net/wireless/mediatek/mt76/mt7921/main.c > > @@ -373,6 +373,11 @@ void mt7921_roc_abort_sync(struct mt792x_dev *dev) > > > > timer_delete_sync(&phy->roc_timer); > > cancel_work_sync(&phy->roc_work); > > + /* Note: caller must hold mutex if ieee80211_iterate_interfaces is > > + * needed for ROC cleanup. Some call sites (like mt7921_mac_sta_remove) > > + * already hold the mutex via mt76_sta_remove(). For suspend paths, > > + * the mutex should be acquired before calling this function. 
> > + */ > > if (test_and_clear_bit(MT76_STATE_ROC, &phy->mt76->state)) > > ieee80211_iterate_interfaces(mt76_hw(dev), > > IEEE80211_IFACE_ITER_RESUME_ALL, > > @@ -619,6 +624,7 @@ void mt7921_set_runtime_pm(struct mt792x_dev *dev) > > bool monitor = !!(hw->conf.flags & IEEE80211_CONF_MONITOR); > > > > pm->enable = pm->enable_user && !monitor; > > + /* Note: caller (debugfs) must hold mutex before calling this function */ > > ieee80211_iterate_active_interfaces(hw, > > IEEE80211_IFACE_ITER_RESUME_ALL, > > mt7921_pm_interface_iter, dev); > > @@ -765,6 +771,9 @@ mt7921_regd_set_6ghz_power_type(struct ieee80211_vif *vif, bool is_add) > > struct mt792x_dev *dev = phy->dev; > > u32 valid_vif_num = 0; > > > > + /* Note: caller (mt7921_mac_sta_add/remove via mt76_sta_add/remove) > > + * already holds dev->mt76.mutex, so we must not acquire it here. > > + */ > > ieee80211_iterate_active_interfaces(mt76_hw(dev), > > IEEE80211_IFACE_ITER_RESUME_ALL, > > mt7921_calc_vif_num, &valid_vif_num); > > It looks like these comments should be replaced with > lockdep_assert_held, so that these assumptions can be verified > automatically instead of doing so by hand. > > > > diff --git a/drivers/net/wireless/mediatek/mt76/mt7921/pci.c b/drivers/net/wireless/mediatek/mt76/mt7921/pci.c > > index ec9686183251..9f76b334b93d 100644 > > --- a/drivers/net/wireless/mediatek/mt76/mt7921/pci.c > > +++ b/drivers/net/wireless/mediatek/mt76/mt7921/pci.c > > @@ -426,7 +426,9 @@ static int mt7921_pci_suspend(struct device *device) > > cancel_delayed_work_sync(&pm->ps_work); > > cancel_work_sync(&pm->wake_work); > > > > + mt792x_mutex_acquire(dev); > > mt7921_roc_abort_sync(dev); > > + mt792x_mutex_release(dev); > The next patch is removing those... > > - Felix ^ permalink raw reply [flat|nested] 113+ messages in thread
* [PATCH 05/13] wifi: mt76: mt7921: fix deadlock in sta removal and suspend ROC abort 2026-01-20 20:10 ` [PATCH v6 00/13] wifi: mt76: stability fixes for deadlocks, NULL derefs, " Zac ` (3 preceding siblings ...) 2026-01-20 20:10 ` [PATCH 04/13] wifi: mt76: mt7921: add mutex protection in critical paths Zac @ 2026-01-20 20:10 ` Zac 2026-01-20 20:10 ` [PATCH 06/13] wifi: mt76: mt7925: add comprehensive NULL pointer protection for MLO Zac ` (7 subsequent siblings) 12 siblings, 0 replies; 113+ messages in thread From: Zac @ 2026-01-20 20:10 UTC (permalink / raw) To: sean.wang Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless, linux, lorenzo, nbd, ryder.lee, sean.wang, zac, zbowling From: Zac Bowling <zac@zacbowling.com> Fix deadlock scenarios in mt7921 ROC (Remain On Channel) abort paths: 1. Suspend path deadlock (pci.c, sdio.c): - Previous fix (b74d48c46f) added mutex around mt7921_roc_abort_sync - But roc_work acquires mutex, so cancel_work_sync can deadlock - Fix: Remove mutex wrappers since mt7921_roc_abort_sync doesn't actually need them (it only calls timer_delete_sync, cancel_work_sync, and ieee80211_iterate_interfaces which handles its own locking) 2. sta_remove path deadlock: - mt7921_mac_sta_remove is called from mt76_sta_remove which holds mutex - Calling mt7921_roc_abort_sync → cancel_work_sync can deadlock if roc_work is waiting for the mutex - Fix: Add mt7921_roc_abort_async (matching mt7925 pattern) that sets abort flag and schedules work instead of blocking - Add abort flag checking in mt7921_roc_work to handle async abort The fix mirrors the mt7925 implementation which already handles these scenarios correctly. 
Fixes: b74d48c46f ("wifi: mt76: mt7921: fix mutex handling in multiple paths") Signed-off-by: Zac Bowling <zac@zacbowling.com> --- .../net/wireless/mediatek/mt76/mt7921/main.c | 29 +++++++++++++++---- .../net/wireless/mediatek/mt76/mt7921/pci.c | 2 -- .../net/wireless/mediatek/mt76/mt7921/sdio.c | 2 -- 3 files changed, 23 insertions(+), 10 deletions(-) diff --git a/drivers/net/wireless/mediatek/mt76/mt7921/main.c b/drivers/net/wireless/mediatek/mt76/mt7921/main.c index 196fcb1e2e94..f3941a25fd6f 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7921/main.c +++ b/drivers/net/wireless/mediatek/mt76/mt7921/main.c @@ -367,17 +367,24 @@ static void mt7921_roc_iter(void *priv, u8 *mac, mt7921_mcu_abort_roc(phy, mvif, phy->roc_token_id); } +/* Async ROC abort - safe to call while holding mutex. + * Sets abort flag and schedules roc_work for cleanup. + */ +static void mt7921_roc_abort_async(struct mt792x_dev *dev) +{ + struct mt792x_phy *phy = &dev->phy; + + set_bit(MT76_STATE_ROC_ABORT, &phy->mt76->state); + timer_delete(&phy->roc_timer); + ieee80211_queue_work(phy->mt76->hw, &phy->roc_work); +} + void mt7921_roc_abort_sync(struct mt792x_dev *dev) { struct mt792x_phy *phy = &dev->phy; timer_delete_sync(&phy->roc_timer); cancel_work_sync(&phy->roc_work); - /* Note: caller must hold mutex if ieee80211_iterate_interfaces is - * needed for ROC cleanup. Some call sites (like mt7921_mac_sta_remove) - * already hold the mutex via mt76_sta_remove(). For suspend paths, - * the mutex should be acquired before calling this function. - */ if (test_and_clear_bit(MT76_STATE_ROC, &phy->mt76->state)) ieee80211_iterate_interfaces(mt76_hw(dev), IEEE80211_IFACE_ITER_RESUME_ALL, @@ -392,6 +399,15 @@ void mt7921_roc_work(struct work_struct *work) phy = (struct mt792x_phy *)container_of(work, struct mt792x_phy, roc_work); + /* Check abort flag before acquiring mutex to prevent deadlock. + * Only send expired callback if ROC was actually active. 
+ */ + if (test_and_clear_bit(MT76_STATE_ROC_ABORT, &phy->mt76->state)) { + if (test_and_clear_bit(MT76_STATE_ROC, &phy->mt76->state)) + ieee80211_remain_on_channel_expired(phy->mt76->hw); + return; + } + if (!test_and_clear_bit(MT76_STATE_ROC, &phy->mt76->state)) return; @@ -888,7 +904,8 @@ void mt7921_mac_sta_remove(struct mt76_dev *mdev, struct ieee80211_vif *vif, struct mt792x_dev *dev = container_of(mdev, struct mt792x_dev, mt76); struct mt792x_sta *msta = (struct mt792x_sta *)sta->drv_priv; - mt7921_roc_abort_sync(dev); + /* Async abort - caller already holds mutex */ + mt7921_roc_abort_async(dev); mt76_connac_free_pending_tx_skbs(&dev->pm, &msta->deflink.wcid); mt76_connac_pm_wake(&dev->mphy, &dev->pm); diff --git a/drivers/net/wireless/mediatek/mt76/mt7921/pci.c b/drivers/net/wireless/mediatek/mt76/mt7921/pci.c index 9f76b334b93d..ec9686183251 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7921/pci.c +++ b/drivers/net/wireless/mediatek/mt76/mt7921/pci.c @@ -426,9 +426,7 @@ static int mt7921_pci_suspend(struct device *device) cancel_delayed_work_sync(&pm->ps_work); cancel_work_sync(&pm->wake_work); - mt792x_mutex_acquire(dev); mt7921_roc_abort_sync(dev); - mt792x_mutex_release(dev); err = mt792x_mcu_drv_pmctrl(dev); if (err < 0) diff --git a/drivers/net/wireless/mediatek/mt76/mt7921/sdio.c b/drivers/net/wireless/mediatek/mt76/mt7921/sdio.c index 92ea2811816f..3421e53dc948 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7921/sdio.c +++ b/drivers/net/wireless/mediatek/mt76/mt7921/sdio.c @@ -219,9 +219,7 @@ static int mt7921s_suspend(struct device *__dev) cancel_delayed_work_sync(&pm->ps_work); cancel_work_sync(&pm->wake_work); - mt792x_mutex_acquire(dev); mt7921_roc_abort_sync(dev); - mt792x_mutex_release(dev); err = mt792x_mcu_drv_pmctrl(dev); if (err < 0) -- 2.52.0 ^ permalink raw reply related [flat|nested] 113+ messages in thread
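Editorial sketch of the async abort flow in patch 05. This is a single-threaded userspace model, not driver code. The bit names mirror MT76_STATE_ROC and MT76_STATE_ROC_ABORT from the patch; the struct, the counters, and every model_* name are assumptions made up for illustration, and the bit helper is not atomic the way the kernel's test_and_clear_bit() is.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical state bits, named after the flags used in the patch. */
enum {
	MODEL_STATE_ROC       = 1u << 0,
	MODEL_STATE_ROC_ABORT = 1u << 1,
};

/* Invented stand-in for the phy state touched by the ROC paths. */
struct model_phy {
	unsigned int state;
	bool work_queued;       /* roc_work has been scheduled */
	int expired_callbacks;  /* "ROC expired" notifications on abort */
	int normal_expiries;    /* runs of the normal (mutex-taking) path */
};

/* Non-atomic stand-in for the kernel's test_and_clear_bit(). */
static bool model_test_and_clear(unsigned int *state, unsigned int bit)
{
	bool was_set = (*state & bit) != 0;

	*state &= ~bit;
	return was_set;
}

/* Mirrors mt7921_roc_abort_async(): it never waits for the work item,
 * so it can be called with the device mutex held. It only flags the
 * abort and queues the work.
 */
static void model_roc_abort_async(struct model_phy *phy)
{
	phy->state |= MODEL_STATE_ROC_ABORT;
	phy->work_queued = true;
}

/* Mirrors the patched mt7921_roc_work(): the abort flag is checked
 * first, before the path that would take the mutex.
 */
static void model_roc_work(struct model_phy *phy)
{
	phy->work_queued = false;

	if (model_test_and_clear(&phy->state, MODEL_STATE_ROC_ABORT)) {
		/* Report expiry only if a ROC was actually active. */
		if (model_test_and_clear(&phy->state, MODEL_STATE_ROC))
			phy->expired_callbacks++;
		return;
	}

	if (!model_test_and_clear(&phy->state, MODEL_STATE_ROC))
		return;

	/* Normal expiry; in the driver this is where the mutex is taken. */
	phy->normal_expiries++;
}
```

Calling model_roc_abort_async() followed by model_roc_work() on a phy with MODEL_STATE_ROC set should clear both bits and count exactly one expiry notification; repeating the pair with no ROC active should count none.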
* [PATCH 06/13] wifi: mt76: mt7925: add comprehensive NULL pointer protection for MLO 2026-01-20 20:10 ` [PATCH v6 00/13] wifi: mt76: stability fixes for deadlocks, NULL derefs, " Zac ` (4 preceding siblings ...) 2026-01-20 20:10 ` [PATCH 05/13] wifi: mt76: mt7921: fix deadlock in sta removal and suspend ROC abort Zac @ 2026-01-20 20:10 ` Zac 2026-01-20 20:10 ` [PATCH 08/13] wifi: mt76: mt7925: add MCU command error handling Zac ` (6 subsequent siblings) 12 siblings, 0 replies; 113+ messages in thread From: Zac @ 2026-01-20 20:10 UTC (permalink / raw) To: sean.wang Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless, linux, lorenzo, nbd, ryder.lee, sean.wang, zac, zbowling From: Zac Bowling <zac@zacbowling.com> Add NULL pointer checks for functions that return pointers to link-related structures throughout the mt7925 driver. During MLO state transitions, these functions can return NULL when link configuration is not synchronized. Functions protected: - mt792x_vif_to_bss_conf(): Returns link BSS configuration - mt792x_vif_to_link(): Returns driver link state - mt792x_sta_to_link(): Returns station link state Files updated: 1. mac.c: - mt7925_vif_connect_iter(): Check bss_conf before use - mt7925_mac_sta_assoc(): Check bss_conf before use 2. main.c: - mt7925_set_key(): Check link_conf and mlink - mt7925_mac_link_sta_add(): Check link_conf and mlink - mt7925_mac_link_sta_assoc(): Check bss_conf and mlink - mt7925_mac_link_sta_remove(): Check bss_conf and mlink - mt7925_change_vif_links(): Check conf before use - mt7925_assign_vif_chanctx(): Check mconf and mlink - mt7925_unassign_vif_chanctx(): Check mconf and mlink - mt7925_mgd_prepare_tx(): Check link_conf 3. 
mcu.c: - mt7925_mcu_sta_phy_tlv(): Check link_sta - mt7925_mcu_sta_amsdu_tlv(): Check link_sta - mt7925_mcu_sta_mld_tlv(): Check link_sta - mt7925_mcu_sta_cmd(): Check mlink - mt7925_mcu_add_bss_info(): Check link_conf - mt7925_mcu_set_chctx(): Check link_conf and mlink Prevents crashes during: - BSSID roaming transitions - MLO setup and teardown - Hardware reset operations - Runtime power management Fixes: c948b5da6bbe ("wifi: mt76: mt7925: add Mediatek Wi-Fi7 driver for mt7925 device") Signed-off-by: Zac Bowling <zac@zacbowling.com> --- .../net/wireless/mediatek/mt76/mt7925/mac.c | 6 ++ .../net/wireless/mediatek/mt76/mt7925/main.c | 82 ++++++++++++++++--- .../net/wireless/mediatek/mt76/mt7925/mcu.c | 22 ++++- 3 files changed, 97 insertions(+), 13 deletions(-) diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/mac.c b/drivers/net/wireless/mediatek/mt76/mt7925/mac.c index 871b67101976..184efe8afa10 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7925/mac.c +++ b/drivers/net/wireless/mediatek/mt76/mt7925/mac.c @@ -1271,6 +1271,12 @@ mt7925_vif_connect_iter(void *priv, u8 *mac, bss_conf = mt792x_vif_to_bss_conf(vif, i); mconf = mt792x_vif_to_link(mvif, i); + /* Skip links that don't have bss_conf set up yet in mac80211. + * This can happen during HW reset when link state is inconsistent. + */ + if (!bss_conf) + continue; + mt76_connac_mcu_uni_add_dev(&dev->mphy, bss_conf, &mconf->mt76, &mvif->sta.deflink.wcid, true); mt7925_mcu_set_tx(dev, bss_conf); diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/main.c b/drivers/net/wireless/mediatek/mt76/mt7925/main.c index 05990455ee7d..74a48742e234 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7925/main.c +++ b/drivers/net/wireless/mediatek/mt76/mt7925/main.c @@ -608,6 +608,10 @@ static int mt7925_set_link_key(struct ieee80211_hw *hw, enum set_key_cmd cmd, link_sta = sta ? 
mt792x_sta_to_link_sta(vif, sta, link_id) : NULL; mconf = mt792x_vif_to_link(mvif, link_id); mlink = mt792x_sta_to_link(msta, link_id); + + if (!link_conf || !mconf || !mlink) + return -EINVAL; + wcid = &mlink->wcid; wcid_keyidx = &wcid->hw_key_idx; @@ -860,12 +864,17 @@ static int mt7925_mac_link_sta_add(struct mt76_dev *mdev, msta = (struct mt792x_sta *)link_sta->sta->drv_priv; mlink = mt792x_sta_to_link(msta, link_id); + if (!mlink) + return -EINVAL; idx = mt76_wcid_alloc(dev->mt76.wcid_mask, MT792x_WTBL_STA - 1); if (idx < 0) return -ENOSPC; mconf = mt792x_vif_to_link(mvif, link_id); + if (!mconf) + return -EINVAL; + mt76_wcid_init(&mlink->wcid, 0); mlink->wcid.sta = 1; mlink->wcid.idx = idx; @@ -891,6 +900,8 @@ static int mt7925_mac_link_sta_add(struct mt76_dev *mdev, MT_WTBL_UPDATE_ADM_COUNT_CLEAR); link_conf = mt792x_vif_to_bss_conf(vif, link_id); + if (!link_conf) + return -EINVAL; /* should update bss info before STA add */ if (vif->type == NL80211_IFTYPE_STATION && !link_sta->sta->tdls) { @@ -997,18 +1008,29 @@ mt7925_mac_set_links(struct mt76_dev *mdev, struct ieee80211_vif *vif) { struct mt792x_dev *dev = container_of(mdev, struct mt792x_dev, mt76); struct mt792x_vif *mvif = (struct mt792x_vif *)vif->drv_priv; - struct ieee80211_bss_conf *link_conf = - mt792x_vif_to_bss_conf(vif, mvif->deflink_id); - struct cfg80211_chan_def *chandef = &link_conf->chanreq.oper; - enum nl80211_band band = chandef->chan->band, secondary_band; + struct ieee80211_bss_conf *link_conf; + struct cfg80211_chan_def *chandef; + enum nl80211_band band, secondary_band; + u16 sel_links; + u8 secondary_link_id; - u16 sel_links = mt76_select_links(vif, 2); - u8 secondary_link_id = __ffs(~BIT(mvif->deflink_id) & sel_links); + link_conf = mt792x_vif_to_bss_conf(vif, mvif->deflink_id); + if (!link_conf) + return; + + chandef = &link_conf->chanreq.oper; + band = chandef->chan->band; + + sel_links = mt76_select_links(vif, 2); + secondary_link_id = __ffs(~BIT(mvif->deflink_id) & sel_links); 
if (!ieee80211_vif_is_mld(vif) || hweight16(sel_links) < 2) return; link_conf = mt792x_vif_to_bss_conf(vif, secondary_link_id); + if (!link_conf) + return; + secondary_band = link_conf->chanreq.oper.chan->band; if (band == NL80211_BAND_2GHZ || @@ -1036,6 +1058,8 @@ static void mt7925_mac_link_sta_assoc(struct mt76_dev *mdev, msta = (struct mt792x_sta *)link_sta->sta->drv_priv; mlink = mt792x_sta_to_link(msta, link_sta->link_id); + if (!mlink) + return; mt792x_mutex_acquire(dev); @@ -1045,12 +1069,13 @@ static void mt7925_mac_link_sta_assoc(struct mt76_dev *mdev, link_conf = mt792x_vif_to_bss_conf(vif, vif->bss_conf.link_id); } - if (vif->type == NL80211_IFTYPE_STATION && !link_sta->sta->tdls) { + if (link_conf && vif->type == NL80211_IFTYPE_STATION && !link_sta->sta->tdls) { struct mt792x_bss_conf *mconf; mconf = mt792x_link_conf_to_mconf(link_conf); - mt7925_mcu_add_bss_info(&dev->phy, mconf->mt76.ctx, - link_conf, link_sta, true); + if (mconf) + mt7925_mcu_add_bss_info(&dev->phy, mconf->mt76.ctx, + link_conf, link_sta, true); } ewma_avg_signal_init(&mlink->avg_ack_signal); @@ -1097,6 +1122,8 @@ static void mt7925_mac_link_sta_remove(struct mt76_dev *mdev, msta = (struct mt792x_sta *)link_sta->sta->drv_priv; mlink = mt792x_sta_to_link(msta, link_id); + if (!mlink) + return; mt7925_roc_abort_sync(dev); @@ -1110,10 +1137,12 @@ static void mt7925_mac_link_sta_remove(struct mt76_dev *mdev, link_conf = mt792x_vif_to_bss_conf(vif, link_id); - if (vif->type == NL80211_IFTYPE_STATION && !link_sta->sta->tdls) { + if (link_conf && vif->type == NL80211_IFTYPE_STATION && !link_sta->sta->tdls) { struct mt792x_bss_conf *mconf; mconf = mt792x_link_conf_to_mconf(link_conf); + if (!mconf) + goto out; if (ieee80211_vif_is_mld(vif)) mt792x_mac_link_bss_remove(dev, mconf, mlink); @@ -1121,6 +1150,7 @@ static void mt7925_mac_link_sta_remove(struct mt76_dev *mdev, mt7925_mcu_add_bss_info(&dev->phy, mconf->mt76.ctx, link_conf, link_sta, false); } +out: 
spin_lock_bh(&mdev->sta_poll_lock); if (!list_empty(&mlink->wcid.poll_list)) @@ -1308,6 +1338,8 @@ mt7925_mlo_pm_iter(void *priv, u8 *mac, struct ieee80211_vif *vif) mt792x_mutex_acquire(dev); for_each_set_bit(i, &valid, IEEE80211_MLD_MAX_NUM_LINKS) { bss_conf = mt792x_vif_to_bss_conf(vif, i); + if (!bss_conf) + continue; mt7925_mcu_uni_bss_ps(dev, bss_conf); } mt792x_mutex_release(dev); @@ -1634,6 +1666,8 @@ static void mt7925_ipv6_addr_change(struct ieee80211_hw *hw, for_each_set_bit(i, &valid, IEEE80211_MLD_MAX_NUM_LINKS) { bss_conf = mt792x_vif_to_bss_conf(vif, i); + if (!bss_conf) + continue; __mt7925_ipv6_addr_change(hw, bss_conf, idev); } } @@ -1695,6 +1729,9 @@ mt7925_conf_tx(struct ieee80211_hw *hw, struct ieee80211_vif *vif, [IEEE80211_AC_BK] = 1, }; + if (!mconf) + return -EINVAL; + /* firmware uses access class index */ mconf->queue_params[mq_to_aci[queue]] = *params; @@ -1865,6 +1902,8 @@ static void mt7925_vif_cfg_changed(struct ieee80211_hw *hw, if (changed & BSS_CHANGED_ARP_FILTER) { for_each_set_bit(i, &valid, IEEE80211_MLD_MAX_NUM_LINKS) { bss_conf = mt792x_vif_to_bss_conf(vif, i); + if (!bss_conf) + continue; mt7925_mcu_update_arp_filter(&dev->mt76, bss_conf); } } @@ -1880,6 +1919,8 @@ static void mt7925_vif_cfg_changed(struct ieee80211_hw *hw, } else if (mvif->mlo_pm_state == MT792x_MLO_CHANGED_PS) { for_each_set_bit(i, &valid, IEEE80211_MLD_MAX_NUM_LINKS) { bss_conf = mt792x_vif_to_bss_conf(vif, i); + if (!bss_conf) + continue; mt7925_mcu_uni_bss_ps(dev, bss_conf); } } @@ -1901,7 +1942,12 @@ static void mt7925_link_info_changed(struct ieee80211_hw *hw, struct ieee80211_bss_conf *link_conf; mconf = mt792x_vif_to_link(mvif, info->link_id); + if (!mconf) + return; + link_conf = mt792x_vif_to_bss_conf(vif, mconf->link_id); + if (!link_conf) + return; mt792x_mutex_acquire(dev); @@ -2025,6 +2071,11 @@ mt7925_change_vif_links(struct ieee80211_hw *hw, struct ieee80211_vif *vif, mlink = mlinks[link_id]; link_conf = mt792x_vif_to_bss_conf(vif, link_id); 
+ if (!link_conf) { + err = -EINVAL; + goto free; + } + rcu_assign_pointer(mvif->link_conf[link_id], mconf); rcu_assign_pointer(mvif->sta.link[link_id], mlink); @@ -2105,9 +2156,14 @@ static int mt7925_assign_vif_chanctx(struct ieee80211_hw *hw, if (ieee80211_vif_is_mld(vif)) { mconf = mt792x_vif_to_link(mvif, link_conf->link_id); + if (!mconf) { + mutex_unlock(&dev->mt76.mutex); + return -EINVAL; + } + pri_link_conf = mt792x_vif_to_bss_conf(vif, mvif->deflink_id); - if (vif->type == NL80211_IFTYPE_STATION && + if (pri_link_conf && vif->type == NL80211_IFTYPE_STATION && mconf == &mvif->bss_conf) mt7925_mcu_add_bss_info(&dev->phy, NULL, pri_link_conf, NULL, true); @@ -2136,6 +2192,10 @@ static void mt7925_unassign_vif_chanctx(struct ieee80211_hw *hw, if (ieee80211_vif_is_mld(vif)) { mconf = mt792x_vif_to_link(mvif, link_conf->link_id); + if (!mconf) { + mutex_unlock(&dev->mt76.mutex); + return; + } if (vif->type == NL80211_IFTYPE_STATION && mconf == &mvif->bss_conf) diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/mcu.c b/drivers/net/wireless/mediatek/mt76/mt7925/mcu.c index cf0fdea45cf7..94ec62a4538a 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7925/mcu.c +++ b/drivers/net/wireless/mediatek/mt76/mt7925/mcu.c @@ -1087,6 +1087,8 @@ mt7925_mcu_sta_hdr_trans_tlv(struct sk_buff *skb, struct mt792x_link_sta *mlink; mlink = mt792x_sta_to_link(msta, link_sta->link_id); + if (!mlink) + return; wcid = &mlink->wcid; } else { wcid = &mvif->sta.deflink.wcid; @@ -1120,6 +1122,9 @@ int mt7925_mcu_wtbl_update_hdr_trans(struct mt792x_dev *dev, link_sta = mt792x_sta_to_link_sta(vif, sta, link_id); mconf = mt792x_vif_to_link(mvif, link_id); + if (!mlink || !mconf) + return -EINVAL; + skb = __mt76_connac_mcu_alloc_sta_req(&dev->mt76, &mconf->mt76, &mlink->wcid, MT7925_STA_UPDATE_MAX_SIZE); @@ -1741,6 +1746,8 @@ mt7925_mcu_sta_amsdu_tlv(struct sk_buff *skb, amsdu->amsdu_en = true; mlink = mt792x_sta_to_link(msta, link_sta->link_id); + if (!mlink) + return; mlink->wcid.amsdu 
= true; switch (link_sta->agg.max_amsdu_len) { @@ -1773,6 +1780,10 @@ mt7925_mcu_sta_phy_tlv(struct sk_buff *skb, link_conf = mt792x_vif_to_bss_conf(vif, link_sta->link_id); mconf = mt792x_vif_to_link(mvif, link_sta->link_id); + + if (!link_conf || !mconf) + return; + chandef = mconf->mt76.ctx ? &mconf->mt76.ctx->def : &link_conf->chanreq.oper; @@ -1851,6 +1862,10 @@ mt7925_mcu_sta_rate_ctrl_tlv(struct sk_buff *skb, link_conf = mt792x_vif_to_bss_conf(vif, link_sta->link_id); mconf = mt792x_vif_to_link(mvif, link_sta->link_id); + + if (!link_conf || !mconf) + return; + chandef = mconf->mt76.ctx ? &mconf->mt76.ctx->def : &link_conf->chanreq.oper; band = chandef->chan->band; @@ -1935,6 +1950,9 @@ mt7925_mcu_sta_mld_tlv(struct sk_buff *skb, mconf = mt792x_vif_to_link(mvif, i); mlink = mt792x_sta_to_link(msta, i); + if (!mconf || !mlink) + continue; + mld->link[cnt].wlan_id = cpu_to_le16(mlink->wcid.idx); mld->link[cnt++].bss_idx = mconf->mt76.idx; @@ -2027,13 +2045,13 @@ int mt7925_mcu_sta_update(struct mt792x_dev *dev, .rcpi = to_rcpi(rssi), }; struct mt792x_sta *msta; - struct mt792x_link_sta *mlink; + struct mt792x_link_sta *mlink = NULL; if (link_sta) { msta = (struct mt792x_sta *)link_sta->sta->drv_priv; mlink = mt792x_sta_to_link(msta, link_sta->link_id); } - info.wcid = link_sta ? &mlink->wcid : &mvif->sta.deflink.wcid; + info.wcid = (link_sta && mlink) ? &mlink->wcid : &mvif->sta.deflink.wcid; info.newly = state != MT76_STA_INFO_STATE_ASSOC; return mt7925_mcu_sta_cmd(&dev->mphy, &info); -- 2.52.0 ^ permalink raw reply related [flat|nested] 113+ messages in thread
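Editorial sketch of the loop pattern patch 06 applies wherever valid_links is walked. It is a userspace model under stated assumptions: MODEL_MAX_LINKS, the two structs, and the model_* functions are hypothetical stand-ins, not the mac80211 or mt76 definitions. The point it illustrates is that a bit set in the bitmap does not guarantee that the matching pointer is populated.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_MAX_LINKS 15 /* hypothetical bound for this sketch */

/* Invented stand-in for a per-link configuration object. */
struct model_bss_conf {
	int link_id;
};

/* Invented stand-in for a vif: a bitmap of links the driver considers
 * valid, plus per-link pointers that may lag behind the bitmap.
 */
struct model_vif {
	unsigned long valid_links;
	struct model_bss_conf *link_conf[MODEL_MAX_LINKS];
};

/* Plays the role of mt792x_vif_to_bss_conf(): may return NULL. */
static struct model_bss_conf *
model_vif_to_bss_conf(const struct model_vif *vif, unsigned int link_id)
{
	if (link_id >= MODEL_MAX_LINKS)
		return NULL;
	return vif->link_conf[link_id];
}

/* Walks every link marked valid and skips those whose configuration is
 * not available yet. Returns how many links were actually processed.
 */
static int model_configure_valid_links(const struct model_vif *vif)
{
	int configured = 0;
	unsigned int i;

	for (i = 0; i < MODEL_MAX_LINKS; i++) {
		struct model_bss_conf *bss_conf;

		if (!(vif->valid_links & (1UL << i)))
			continue;

		bss_conf = model_vif_to_bss_conf(vif, i);
		if (!bss_conf)
			continue; /* marked valid, but not set up yet */

		/* bss_conf is safe to dereference from here on. */
		if (bss_conf->link_id == (int)i)
			configured++;
	}

	return configured;
}
```

With links 0, 1 and 2 marked valid but only links 0 and 2 populated, the loop processes two links and skips the third without dereferencing a NULL pointer.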
* [PATCH 08/13] wifi: mt76: mt7925: add MCU command error handling 2026-01-20 20:10 ` [PATCH v6 00/13] wifi: mt76: stability fixes for deadlocks, NULL derefs, " Zac ` (5 preceding siblings ...) 2026-01-20 20:10 ` [PATCH 06/13] wifi: mt76: mt7925: add comprehensive NULL pointer protection for MLO Zac @ 2026-01-20 20:10 ` Zac 2026-01-20 20:10 ` [PATCH 09/13] wifi: mt76: mt7925: add lockdep assertions for mutex verification Zac ` (5 subsequent siblings) 12 siblings, 0 replies; 113+ messages in thread From: Zac @ 2026-01-20 20:10 UTC (permalink / raw) To: sean.wang Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless, linux, lorenzo, nbd, ryder.lee, sean.wang, zac, zbowling From: Zac Bowling <zac@zacbowling.com> Add proper error handling for MCU command return values that were previously being ignored. Without proper error handling, failures in MCU communication can leave the driver in an inconsistent state. Functions updated: 1. main.c: mt7925_ampdu_action() - BA session setup - Check mt7925_mcu_uni_tx_ba() return value - Check mt7925_mcu_uni_rx_ba() return value - Return error to mac80211 on failure 2. main.c: mt7925_mac_link_sta_add() - Station addition - Check mt7925_mcu_add_bss_info() return value - Propagate errors during station setup 3. main.c: mt7925_set_key() - Key installation - Check mt7925_mcu_add_bss_info() return value when setting BSS info before key installation - Prevent key setup on communication failure These changes ensure that MCU communication failures are properly detected and reported to mac80211, allowing proper error recovery instead of leaving the driver in an undefined state. 
Fixes: c948b5da6bbe ("wifi: mt76: mt7925: add Mediatek Wi-Fi7 driver for mt7925 device") Signed-off-by: Zac Bowling <zac@zacbowling.com> --- .../net/wireless/mediatek/mt76/mt7925/main.c | 30 +++++++++++-------- 1 file changed, 18 insertions(+), 12 deletions(-) diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/main.c b/drivers/net/wireless/mediatek/mt76/mt7925/main.c index f1884944f77d..59a5b22a6ed6 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7925/main.c +++ b/drivers/net/wireless/mediatek/mt76/mt7925/main.c @@ -641,8 +641,10 @@ static int mt7925_set_link_key(struct ieee80211_hw *hw, enum set_key_cmd cmd, struct mt792x_phy *phy = mt792x_hw_phy(hw); mconf->mt76.cipher = mt7925_mcu_get_cipher(key->cipher); - mt7925_mcu_add_bss_info(phy, mconf->mt76.ctx, link_conf, - link_sta, true); + err = mt7925_mcu_add_bss_info(phy, mconf->mt76.ctx, link_conf, + link_sta, true); + if (err) + goto out; } if (cmd == SET_KEY) @@ -908,11 +910,14 @@ static int mt7925_mac_link_sta_add(struct mt76_dev *mdev, /* should update bss info before STA add */ if (vif->type == NL80211_IFTYPE_STATION && !link_sta->sta->tdls) { if (ieee80211_vif_is_mld(vif)) - mt7925_mcu_add_bss_info(&dev->phy, mconf->mt76.ctx, - link_conf, link_sta, link_sta != mlink->pri_link); + ret = mt7925_mcu_add_bss_info(&dev->phy, mconf->mt76.ctx, + link_conf, link_sta, + link_sta != mlink->pri_link); else - mt7925_mcu_add_bss_info(&dev->phy, mconf->mt76.ctx, - link_conf, link_sta, false); + ret = mt7925_mcu_add_bss_info(&dev->phy, mconf->mt76.ctx, + link_conf, link_sta, false); + if (ret) + return ret; } if (ieee80211_vif_is_mld(vif) && @@ -1291,22 +1296,22 @@ mt7925_ampdu_action(struct ieee80211_hw *hw, struct ieee80211_vif *vif, case IEEE80211_AMPDU_RX_START: mt76_rx_aggr_start(&dev->mt76, &msta->deflink.wcid, tid, ssn, params->buf_size); - mt7925_mcu_uni_rx_ba(dev, params, true); + ret = mt7925_mcu_uni_rx_ba(dev, params, true); break; case IEEE80211_AMPDU_RX_STOP: mt76_rx_aggr_stop(&dev->mt76, 
&msta->deflink.wcid, tid); - mt7925_mcu_uni_rx_ba(dev, params, false); + ret = mt7925_mcu_uni_rx_ba(dev, params, false); break; case IEEE80211_AMPDU_TX_OPERATIONAL: mtxq->aggr = true; mtxq->send_bar = false; - mt7925_mcu_uni_tx_ba(dev, params, true); + ret = mt7925_mcu_uni_tx_ba(dev, params, true); break; case IEEE80211_AMPDU_TX_STOP_FLUSH: case IEEE80211_AMPDU_TX_STOP_FLUSH_CONT: mtxq->aggr = false; clear_bit(tid, &msta->deflink.wcid.ampdu_state); - mt7925_mcu_uni_tx_ba(dev, params, false); + ret = mt7925_mcu_uni_tx_ba(dev, params, false); break; case IEEE80211_AMPDU_TX_START: set_bit(tid, &msta->deflink.wcid.ampdu_state); @@ -1315,8 +1320,9 @@ mt7925_ampdu_action(struct ieee80211_hw *hw, struct ieee80211_vif *vif, case IEEE80211_AMPDU_TX_STOP_CONT: mtxq->aggr = false; clear_bit(tid, &msta->deflink.wcid.ampdu_state); - mt7925_mcu_uni_tx_ba(dev, params, false); - ieee80211_stop_tx_ba_cb_irqsafe(vif, sta->addr, tid); + ret = mt7925_mcu_uni_tx_ba(dev, params, false); + if (!ret) + ieee80211_stop_tx_ba_cb_irqsafe(vif, sta->addr, tid); break; } mt792x_mutex_release(dev); -- 2.52.0 ^ permalink raw reply related [flat|nested] 113+ messages in thread
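Editorial sketch of the error propagation added in patch 08, using the IEEE80211_AMPDU_TX_STOP_CONT case as the example. This is a userspace model: struct model_dev and the model_* functions are hypothetical, and the fake MCU call simply returns a preset value. It illustrates the two changes in that case: the return value is kept, and the stop notification to the stack only happens when the command succeeded.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Invented stand-in for the device. */
struct model_dev {
	int mcu_result;        /* what the fake MCU command returns */
	int stop_ba_callbacks; /* times the stack would have been notified */
};

/* Plays the role of mt7925_mcu_uni_tx_ba(): 0 on success, negative
 * errno on failure.
 */
static int model_mcu_uni_tx_ba(struct model_dev *dev, bool enable)
{
	(void)enable;
	return dev->mcu_result;
}

/* Models the patched TX_STOP_CONT handling: propagate the error and
 * skip the completion callback if the firmware command failed.
 */
static int model_ampdu_tx_stop_cont(struct model_dev *dev)
{
	int ret = model_mcu_uni_tx_ba(dev, false);

	if (!ret)
		dev->stop_ba_callbacks++;

	return ret;
}
```

A successful command returns 0 and triggers one callback; a failing one (for example -EIO) is returned to the caller and triggers none.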
* [PATCH 09/13] wifi: mt76: mt7925: add lockdep assertions for mutex verification 2026-01-20 20:10 ` [PATCH v6 00/13] wifi: mt76: stability fixes for deadlocks, NULL derefs, " Zac ` (6 preceding siblings ...) 2026-01-20 20:10 ` [PATCH 08/13] wifi: mt76: mt7925: add MCU command error handling Zac @ 2026-01-20 20:10 ` Zac 2026-01-20 20:10 ` [PATCH 10/13] wifi: mt76: mt7925: fix MLO roaming and ROC setup issues Zac ` (4 subsequent siblings) 12 siblings, 0 replies; 113+ messages in thread From: Zac @ 2026-01-20 20:10 UTC (permalink / raw) To: sean.wang Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless, linux, lorenzo, nbd, ryder.lee, sean.wang, zac, zbowling From: Zac Bowling <zac@zacbowling.com> Add lockdep_assert_held() calls to critical MCU functions to help catch mutex violations during development and debugging. This follows the pattern used in other mt76 drivers (mt7996, mt7915, mt7615). Functions with new assertions: - mt7925_mcu_add_bss_info(): Core BSS configuration MCU command - mt7925_mcu_sta_update(): Station record update MCU command - mt7925_mcu_uni_bss_ps(): Power save state MCU command These functions modify firmware state and must be called with the device mutex held to prevent race conditions. The lockdep assertions will trigger warnings at runtime if code paths exist that call these functions without proper mutex protection. This aids in detecting the class of bugs fixed by patches in this series. 
Signed-off-by: Zac Bowling <zac@zacbowling.com> --- drivers/net/wireless/mediatek/mt76/mt7925/mcu.c | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/mcu.c b/drivers/net/wireless/mediatek/mt76/mt7925/mcu.c index 94ec62a4538a..1c58b0be2be4 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7925/mcu.c +++ b/drivers/net/wireless/mediatek/mt76/mt7925/mcu.c @@ -1532,6 +1532,8 @@ int mt7925_mcu_uni_bss_ps(struct mt792x_dev *dev, }, }; + lockdep_assert_held(&dev->mt76.mutex); + if (link_conf->vif->type != NL80211_IFTYPE_STATION) return -EOPNOTSUPP; @@ -2047,6 +2049,8 @@ int mt7925_mcu_sta_update(struct mt792x_dev *dev, struct mt792x_sta *msta; struct mt792x_link_sta *mlink = NULL; + lockdep_assert_held(&dev->mt76.mutex); + if (link_sta) { msta = (struct mt792x_sta *)link_sta->sta->drv_priv; mlink = mt792x_sta_to_link(msta, link_sta->link_id); @@ -2853,6 +2857,8 @@ int mt7925_mcu_add_bss_info(struct mt792x_phy *phy, struct mt792x_link_sta *mlink_bc; struct sk_buff *skb; + lockdep_assert_held(&dev->mt76.mutex); + skb = __mt7925_mcu_alloc_bss_req(&dev->mt76, &mconf->mt76, MT7925_BSS_UPDATE_MAX_SIZE); if (IS_ERR(skb)) -- 2.52.0 ^ permalink raw reply related [flat|nested] 113+ messages in thread
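Editorial sketch of what the assertions in patch 09 check. The kernel's lockdep_assert_held() cannot run in userspace, so this single-threaded model replaces it with a hypothetical model_assert_held() that counts violations instead of emitting a warning. All names here are assumptions for illustration. Like the real macro, which is only active when lockdep is enabled in the kernel configuration, the stand-in reports the problem and lets execution continue.

```c
#include <assert.h>
#include <stdbool.h>

/* Invented stand-in for the device and its mutex. */
struct model_dev {
	bool mutex_held;
	int lock_violations; /* times a function ran without the lock */
	int mcu_commands;    /* commands "sent" to firmware */
};

static void model_mutex_acquire(struct model_dev *dev)
{
	dev->mutex_held = true;
}

static void model_mutex_release(struct model_dev *dev)
{
	dev->mutex_held = false;
}

/* Stand-in for lockdep_assert_held(): records a violation rather than
 * warning, and does not stop the caller.
 */
static void model_assert_held(struct model_dev *dev)
{
	if (!dev->mutex_held)
		dev->lock_violations++;
}

/* Plays the role of an MCU helper that must run under the mutex. */
static int model_mcu_uni_bss_ps(struct model_dev *dev)
{
	model_assert_held(dev);
	dev->mcu_commands++;
	return 0;
}
```

Calling model_mcu_uni_bss_ps() without the lock records one violation; wrapping the call in model_mutex_acquire() and model_mutex_release() records none. This is the kind of automatic check that replaces "caller must hold mutex" comments.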
* [PATCH 10/13] wifi: mt76: mt7925: fix MLO roaming and ROC setup issues 2026-01-20 20:10 ` [PATCH v6 00/13] wifi: mt76: stability fixes for deadlocks, NULL derefs, " Zac ` (7 preceding siblings ...) 2026-01-20 20:10 ` [PATCH 09/13] wifi: mt76: mt7925: add lockdep assertions for mutex verification Zac @ 2026-01-20 20:10 ` Zac 2026-01-20 20:10 ` [PATCH 11/13] wifi: mt76: mt7925: fix BA session teardown during beacon loss Zac ` (3 subsequent siblings) 12 siblings, 0 replies; 113+ messages in thread From: Zac @ 2026-01-20 20:10 UTC (permalink / raw) To: sean.wang Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless, linux, lorenzo, nbd, ryder.lee, sean.wang, zac, zbowling From: Zac Bowling <zac@zacbowling.com> Fix two issues related to MLO roaming and remain-on-channel operations: 1. Key removal failure during MLO roaming: During MLO roaming, key removal can fail because the WCID (wireless client ID) is already cleaned up before the key removal operation completes. When roaming between APs in an MLO setup: - mac80211 triggers sta_state changes - mt7925_mac_link_sta_remove() is called for the old link - WCID is cleared via mt76_wcid_cleanup() - Later, key removal MCU command uses the now-invalid WCID Fix by checking if the WCID is still valid before sending key removal commands to firmware. If the WCID has already been cleaned up, skip the MCU command since the firmware has already removed the keys. 2. Kernel warning in MLO ROC setup: When starting a remain-on-channel operation in MLO mode, the driver passes incorrect parameters to mt7925_mcu_set_roc(), causing a kernel warning about invalid chanctx usage. Fix by checking for valid chanctx and link configuration before setting up ROC, and use the correct link_id from the vif when available. 
Fixes: c948b5da6bbe ("wifi: mt76: mt7925: add Mediatek Wi-Fi7 driver for mt7925 device") Signed-off-by: Zac Bowling <zac@zacbowling.com> --- .../net/wireless/mediatek/mt76/mt7925/main.c | 9 ++++++++- .../net/wireless/mediatek/mt76/mt7925/mcu.c | 20 +++++++++++++------ 2 files changed, 22 insertions(+), 7 deletions(-) diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/main.c b/drivers/net/wireless/mediatek/mt76/mt7925/main.c index 59a5b22a6ed6..7d68b08f445a 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7925/main.c +++ b/drivers/net/wireless/mediatek/mt76/mt7925/main.c @@ -609,8 +609,15 @@ static int mt7925_set_link_key(struct ieee80211_hw *hw, enum set_key_cmd cmd, mconf = mt792x_vif_to_link(mvif, link_id); mlink = mt792x_sta_to_link(msta, link_id); - if (!link_conf || !mconf || !mlink) + if (!link_conf || !mconf || !mlink) { + /* During MLO roaming, link state may be torn down before + * mac80211 requests key removal. If removing a key and + * the link is already gone, consider it successfully removed. + */ + if (cmd != SET_KEY) + return 0; return -EINVAL; + } wcid = &mlink->wcid; wcid_keyidx = &wcid->hw_key_idx; diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/mcu.c b/drivers/net/wireless/mediatek/mt76/mt7925/mcu.c index 1c58b0be2be4..6f7fc1b9a440 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7925/mcu.c +++ b/drivers/net/wireless/mediatek/mt76/mt7925/mcu.c @@ -1342,15 +1342,23 @@ int mt7925_mcu_set_mlo_roc(struct mt792x_bss_conf *mconf, u16 sel_links, for (i = 0; i < ARRAY_SIZE(links); i++) { links[i].id = i ? __ffs(~BIT(mconf->link_id) & sel_links) : mconf->link_id; + link_conf = mt792x_vif_to_bss_conf(vif, links[i].id); - if (WARN_ON_ONCE(!link_conf)) - return -EPERM; + if (!link_conf) + return -ENOLINK; links[i].chan = link_conf->chanreq.oper.chan; - if (WARN_ON_ONCE(!links[i].chan)) - return -EPERM; + if (!links[i].chan) + /* Channel not configured yet - this can happen during + * MLO AP setup when links are being added sequentially. 
+ * Return -ENOLINK to indicate link not ready. + */ + return -ENOLINK; links[i].mconf = mt792x_vif_to_link(mvif, links[i].id); + if (!links[i].mconf) + return -ENOLINK; + links[i].tag = links[i].id == mconf->link_id ? UNI_ROC_ACQUIRE : UNI_ROC_SUB_LINK; @@ -1364,8 +1372,8 @@ int mt7925_mcu_set_mlo_roc(struct mt792x_bss_conf *mconf, u16 sel_links, type = MT7925_ROC_REQ_JOIN; for (i = 0; i < ARRAY_SIZE(links) && i < hweight16(vif->active_links); i++) { - if (WARN_ON_ONCE(!links[i].mconf || !links[i].chan)) - continue; + if (!links[i].mconf || !links[i].chan) + return -ENOLINK; chan = links[i].chan; center_ch = ieee80211_frequency_to_channel(chan->center_freq); -- 2.52.0 ^ permalink raw reply related [flat|nested] 113+ messages in thread
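A note on the link selection in mt7925_mcu_set_mlo_roc() touched by patch 10/13: the expression `i ? __ffs(~BIT(mconf->link_id) & sel_links) : mconf->link_id` puts the current link in slot 0 and the lowest-numbered other selected link in slot 1. The following is a minimal user-space sketch of that arithmetic only, not driver code. `lowest_set_bit()` and `roc_link_id()` are hypothetical names, and the GCC/Clang builtin `__builtin_ctz()` is assumed here as a stand-in for the kernel's `__ffs()`.

```c
#include <assert.h>
#include <stdint.h>

#define BIT(n) (1u << (n))

/* Index of the lowest set bit: a user-space stand-in for the kernel's
 * __ffs(). As with __ffs(), there is no meaningful result for a zero
 * word, so this sketch asserts instead of returning garbage.
 */
static unsigned int lowest_set_bit(uint32_t word)
{
	assert(word != 0);
	return (unsigned int)__builtin_ctz(word);
}

/* Hypothetical helper mirroring:
 *   links[i].id = i ? __ffs(~BIT(link_id) & sel_links) : link_id;
 * Slot 0 is the link the request came in on; any later slot is the
 * lowest-numbered selected link other than that one.
 */
static unsigned int roc_link_id(unsigned int slot, unsigned int link_id,
				uint16_t sel_links)
{
	if (slot == 0)
		return link_id;

	/* Mask out the primary link, then take the lowest remaining bit. */
	return lowest_set_bit(~BIT(link_id) & sel_links);
}
```

With link_id 1 and sel_links 0x3, slot 0 resolves to link 1 and slot 1 to link 0. If sel_links contains no link other than link_id, the masked word is zero; the sketch turns that case into an assertion failure, since the result of __ffs() on zero is undefined.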
* [PATCH 11/13] wifi: mt76: mt7925: fix BA session teardown during beacon loss
  2026-01-20 20:10 ` [PATCH v6 00/13] wifi: mt76: stability fixes for deadlocks, NULL derefs, " Zac
                     ` (8 preceding siblings ...)
  2026-01-20 20:10   ` [PATCH 10/13] wifi: mt76: mt7925: fix MLO roaming and ROC setup issues Zac
@ 2026-01-20 20:10   ` Zac
  2026-01-20 20:10   ` [PATCH 12/13] wifi: mt76: mt7925: fix ROC deadlocks and race conditions Zac
                     ` (2 subsequent siblings)
  12 siblings, 0 replies; 113+ messages in thread
From: Zac @ 2026-01-20 20:10 UTC (permalink / raw)
  To: sean.wang
  Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless,
	linux, lorenzo, nbd, ryder.lee, sean.wang, zac, zbowling

From: Zac Bowling <zac@zacbowling.com>

The ieee80211_stop_tx_ba_cb_irqsafe() callback was conditionally called
only when the MCU command succeeded. However, during beacon connection
loss, the MCU command may fail because the AP is no longer reachable.

If the callback is not called, mac80211's BA session state machine gets
stuck in an intermediate state. When mac80211 later tries to tear down
all BA sessions during disconnection, it hits a WARN in
__ieee80211_stop_tx_ba_session() due to the inconsistent state.

Fix by making the callback unconditional, matching the behavior of
mt7921 and mt7996 drivers. The MCU command failure is acceptable during
disconnection - what matters is that mac80211 is notified to complete
the session teardown.

Reported-by: Sean Wang <sean.wang@mediatek.com>
Signed-off-by: Zac Bowling <zac@zacbowling.com>
---
 drivers/net/wireless/mediatek/mt76/mt7925/main.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/main.c b/drivers/net/wireless/mediatek/mt76/mt7925/main.c
index 7d68b08f445a..82c81c22e39c 100644
--- a/drivers/net/wireless/mediatek/mt76/mt7925/main.c
+++ b/drivers/net/wireless/mediatek/mt76/mt7925/main.c
@@ -1327,9 +1327,13 @@ mt7925_ampdu_action(struct ieee80211_hw *hw, struct ieee80211_vif *vif,
 	case IEEE80211_AMPDU_TX_STOP_CONT:
 		mtxq->aggr = false;
 		clear_bit(tid, &msta->deflink.wcid.ampdu_state);
-		ret = mt7925_mcu_uni_tx_ba(dev, params, false);
-		if (!ret)
-			ieee80211_stop_tx_ba_cb_irqsafe(vif, sta->addr, tid);
+		/* MCU command may fail during beacon loss, but callback must
+		 * always be called to complete the BA session teardown in
+		 * mac80211. Otherwise the state machine gets stuck and triggers
+		 * WARN in __ieee80211_stop_tx_ba_session().
+		 */
+		mt7925_mcu_uni_tx_ba(dev, params, false);
+		ieee80211_stop_tx_ba_cb_irqsafe(vif, sta->addr, tid);
 		break;
 	}
 	mt792x_mutex_release(dev);
-- 
2.52.0

^ permalink raw reply related	[flat|nested] 113+ messages in thread
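To make the failure mode in patch 11/13 concrete, here is a toy user-space model of the TX_STOP_CONT path before and after the change. It is only a sketch under stated assumptions: the three-state `ba_state` enum, `mock_mcu_tx_ba_stop()` and `mock_stop_tx_ba_cb()` are hypothetical stand-ins for mac80211's aggregation state, mt7925_mcu_uni_tx_ba() and ieee80211_stop_tx_ba_cb_irqsafe(), and do not reflect their real signatures or internals.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified, hypothetical view of a TX BA session's lifecycle. */
enum ba_state { BA_OPERATIONAL, BA_STOPPING, BA_IDLE };

struct ba_session {
	enum ba_state state;
};

/* Stand-in for the MCU command: fails when the AP is unreachable. */
static int mock_mcu_tx_ba_stop(bool ap_reachable)
{
	return ap_reachable ? 0 : -1;
}

/* Stand-in for the mac80211 callback that completes the teardown. */
static void mock_stop_tx_ba_cb(struct ba_session *s)
{
	s->state = BA_IDLE;
}

/* Old behaviour: the callback only runs if the MCU command succeeded,
 * so a failed command leaves the session in BA_STOPPING.
 */
static void stop_cont_before(struct ba_session *s, bool ap_reachable)
{
	s->state = BA_STOPPING;
	if (!mock_mcu_tx_ba_stop(ap_reachable))
		mock_stop_tx_ba_cb(s);
}

/* New behaviour: the MCU result is ignored and the callback always
 * runs, so the session always reaches BA_IDLE.
 */
static void stop_cont_after(struct ba_session *s, bool ap_reachable)
{
	s->state = BA_STOPPING;
	(void)mock_mcu_tx_ba_stop(ap_reachable);
	mock_stop_tx_ba_cb(s);
}
```

In this model, calling stop_cont_before() with ap_reachable set to false leaves the session in BA_STOPPING, which corresponds to the stuck state described in the commit message, while stop_cont_after() reaches BA_IDLE in both cases.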
* [PATCH 12/13] wifi: mt76: mt7925: fix ROC deadlocks and race conditions 2026-01-20 20:10 ` [PATCH v6 00/13] wifi: mt76: stability fixes for deadlocks, NULL derefs, " Zac ` (9 preceding siblings ...) 2026-01-20 20:10 ` [PATCH 11/13] wifi: mt76: mt7925: fix BA session teardown during beacon loss Zac @ 2026-01-20 20:10 ` Zac 2026-01-27 11:06 ` Felix Fietkau 2026-01-20 20:10 ` [PATCH 13/13] wifi: mt76: mt7925: fix double wcid initialization race condition Zac 2026-01-27 10:58 ` [PATCH v6 00/13] wifi: mt76: stability fixes for deadlocks, NULL derefs, and race conditions Felix Fietkau 12 siblings, 1 reply; 113+ messages in thread From: Zac @ 2026-01-20 20:10 UTC (permalink / raw) To: sean.wang Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless, linux, lorenzo, nbd, ryder.lee, sean.wang, zac, zbowling From: Zac Bowling <zac@zacbowling.com> Fix multiple interrelated issues in the remain-on-channel (ROC) handling that cause deadlocks, race conditions, and resource leaks. Problems fixed: 1. Deadlock in sta removal ROC abort path: When a station is removed while a ROC operation is in progress, the driver would call mt7925_roc_abort_sync() which waits for ROC completion. However, the ROC work itself needs to acquire mt792x_mutex which is already held during station removal, causing a deadlock. Fix: Use async ROC abort (mt76_connac_mcu_abort_roc) when called from paths that already hold the mutex, and add MT76_STATE_ROC_ABORT flag to coordinate between the abort and the ROC timer. 2. ROC timer race during suspend: The ROC timer could fire after the device started suspending but before the ROC was properly aborted, causing undefined behavior. Fix: Delete ROC timer synchronously before suspend and check device state before processing ROC timeout. 3. ROC rate limiting for MLO auth failures: Rapid ROC requests during MLO authentication can overwhelm the firmware, causing authentication timeouts. The MT7925 firmware has limited ROC handling capacity. 
Fix: Add rate limiting infrastructure with configurable minimum interval between ROC requests. Track last ROC completion time and defer new requests if they arrive too quickly. 4. WCID leak in ROC cleanup: When ROC operations are aborted, the associated WCID resources were not being properly released, causing resource exhaustion over time. Fix: Ensure WCID cleanup happens in all ROC termination paths. 5. Async ROC abort race condition: The async ROC abort could race with normal ROC completion, causing double-free or use-after-free of ROC resources. Fix: Use MT76_STATE_ROC_ABORT flag and proper synchronization to prevent races between async abort and normal completion paths. These fixes work together to provide robust ROC handling that doesn't deadlock, properly releases resources, and handles edge cases during suspend and MLO operations. Fixes: c948b5da6bbe ("wifi: mt76: mt7925: add Mediatek Wi-Fi7 driver for mt7925 device") Signed-off-by: Zac Bowling <zac@zacbowling.com> --- drivers/net/wireless/mediatek/mt76/mt76.h | 1 + .../net/wireless/mediatek/mt76/mt7925/main.c | 175 ++++++++++++++++-- drivers/net/wireless/mediatek/mt76/mt792x.h | 7 + 3 files changed, 171 insertions(+), 12 deletions(-) diff --git a/drivers/net/wireless/mediatek/mt76/mt76.h b/drivers/net/wireless/mediatek/mt76/mt76.h index d05e83ea1cac..91f9dd95c89e 100644 --- a/drivers/net/wireless/mediatek/mt76/mt76.h +++ b/drivers/net/wireless/mediatek/mt76/mt76.h @@ -511,6 +511,7 @@ enum { MT76_STATE_POWER_OFF, MT76_STATE_SUSPEND, MT76_STATE_ROC, + MT76_STATE_ROC_ABORT, MT76_STATE_PM, MT76_STATE_WED_RESET, }; diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/main.c b/drivers/net/wireless/mediatek/mt76/mt7925/main.c index 82c81c22e39c..4b7c13485497 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7925/main.c +++ b/drivers/net/wireless/mediatek/mt76/mt7925/main.c @@ -453,6 +453,24 @@ static void mt7925_roc_iter(void *priv, u8 *mac, mt7925_mcu_abort_roc(phy, &mvif->bss_conf, phy->roc_token_id); } +/* 
Async ROC abort - safe to call while holding mutex. + * Sets abort flag and lets roc_work handle cleanup without blocking. + * This prevents deadlock when called from sta_remove path which holds mutex. + */ +static void mt7925_roc_abort_async(struct mt792x_dev *dev) +{ + struct mt792x_phy *phy = &dev->phy; + + /* Set abort flag - roc_work checks this before acquiring mutex */ + set_bit(MT76_STATE_ROC_ABORT, &phy->mt76->state); + + /* Stop timer and schedule work to handle cleanup. + * Must schedule work since timer may not have fired yet. + */ + timer_delete(&phy->roc_timer); + ieee80211_queue_work(phy->mt76->hw, &phy->roc_work); +} + void mt7925_roc_abort_sync(struct mt792x_dev *dev) { struct mt792x_phy *phy = &dev->phy; @@ -477,6 +495,17 @@ void mt7925_roc_work(struct work_struct *work) phy = (struct mt792x_phy *)container_of(work, struct mt792x_phy, roc_work); + /* Check abort flag BEFORE acquiring mutex to prevent deadlock. + * If abort is requested while we're in the sta_remove path (which + * holds the mutex), we must not try to acquire it or we'll deadlock. + * Clear the flags and only notify mac80211 if ROC was actually active. + */ + if (test_and_clear_bit(MT76_STATE_ROC_ABORT, &phy->mt76->state)) { + if (test_and_clear_bit(MT76_STATE_ROC, &phy->mt76->state)) + ieee80211_remain_on_channel_expired(phy->mt76->hw); + return; + } + if (!test_and_clear_bit(MT76_STATE_ROC, &phy->mt76->state)) return; @@ -504,14 +533,93 @@ static int mt7925_abort_roc(struct mt792x_phy *phy, return err; } +/* ROC rate limiting constants - exponential backoff to prevent MCU overload + * when upper layers trigger rapid reconnection cycles (e.g., MLO auth failures). + * Max backoff ~1.6s, resets after 10s of no timeouts. + */ +#define MT7925_ROC_BACKOFF_BASE_MS 100 +#define MT7925_ROC_BACKOFF_MAX_MS 1600 +#define MT7925_ROC_TIMEOUT_RESET_MS 10000 +#define MT7925_ROC_TIMEOUT_WARN_THRESH 5 + +/* Check if ROC should be throttled due to recent timeouts. 
+ * Returns delay in jiffies if throttling, 0 if OK to proceed. + */ +static unsigned long mt7925_roc_throttle_check(struct mt792x_phy *phy) +{ + unsigned long now = jiffies; + + /* Reset timeout counter if it's been a while since last timeout */ + if (phy->roc_timeout_count && + time_after(now, phy->roc_last_timeout + + msecs_to_jiffies(MT7925_ROC_TIMEOUT_RESET_MS))) { + phy->roc_timeout_count = 0; + phy->roc_backoff_until = 0; + } + + /* Check if we're still in backoff period */ + if (phy->roc_backoff_until && time_before(now, phy->roc_backoff_until)) + return phy->roc_backoff_until - now; + + return 0; +} + +/* Record ROC timeout and calculate backoff period */ +static void mt7925_roc_record_timeout(struct mt792x_phy *phy) +{ + unsigned int backoff_ms; + + phy->roc_last_timeout = jiffies; + phy->roc_timeout_count++; + + /* Exponential backoff: 100ms, 200ms, 400ms, 800ms, 1600ms (capped) */ + backoff_ms = MT7925_ROC_BACKOFF_BASE_MS << + min_t(u8, phy->roc_timeout_count - 1, 4); + if (backoff_ms > MT7925_ROC_BACKOFF_MAX_MS) + backoff_ms = MT7925_ROC_BACKOFF_MAX_MS; + + phy->roc_backoff_until = jiffies + msecs_to_jiffies(backoff_ms); + + /* Warn if we're seeing repeated timeouts - likely upper layer issue */ + if (phy->roc_timeout_count == MT7925_ROC_TIMEOUT_WARN_THRESH) + dev_warn(phy->dev->mt76.dev, + "mt7925: %u consecutive ROC timeouts, possible mac80211/wpa_supplicant issue (MLO key race?)\n", + phy->roc_timeout_count); +} + +/* Clear timeout tracking on successful ROC */ +static void mt7925_roc_clear_timeout(struct mt792x_phy *phy) +{ + phy->roc_timeout_count = 0; + phy->roc_backoff_until = 0; +} + static int mt7925_set_roc(struct mt792x_phy *phy, struct mt792x_bss_conf *mconf, struct ieee80211_channel *chan, int duration, enum mt7925_roc_req type) { + unsigned long throttle; int err; + /* Check rate limiting - if in backoff period, wait or return busy */ + throttle = mt7925_roc_throttle_check(phy); + if (throttle) { + /* For short backoffs, wait; for longer 
ones, return busy */ + if (throttle < msecs_to_jiffies(200)) { + msleep(jiffies_to_msecs(throttle)); + } else { + dev_dbg(phy->dev->mt76.dev, + "mt7925: ROC throttled, %u ms remaining\n", + jiffies_to_msecs(throttle)); + return -EBUSY; + } + } + + /* Clear stale abort flag from previous ROC */ + clear_bit(MT76_STATE_ROC_ABORT, &phy->mt76->state); + if (test_and_set_bit(MT76_STATE_ROC, &phy->mt76->state)) return -EBUSY; @@ -527,7 +635,11 @@ static int mt7925_set_roc(struct mt792x_phy *phy, if (!wait_event_timeout(phy->roc_wait, phy->roc_grant, 4 * HZ)) { mt7925_mcu_abort_roc(phy, mconf, phy->roc_token_id); clear_bit(MT76_STATE_ROC, &phy->mt76->state); + mt7925_roc_record_timeout(phy); err = -ETIMEDOUT; + } else { + /* Successful ROC - reset timeout tracking */ + mt7925_roc_clear_timeout(phy); } out: @@ -538,8 +650,27 @@ static int mt7925_set_mlo_roc(struct mt792x_phy *phy, struct mt792x_bss_conf *mconf, u16 sel_links) { + unsigned long throttle; int err; + /* Check rate limiting - MLO ROC is especially prone to rapid-fire + * during reconnection cycles after MLO authentication failures. 
+ */ + throttle = mt7925_roc_throttle_check(phy); + if (throttle) { + if (throttle < msecs_to_jiffies(200)) { + msleep(jiffies_to_msecs(throttle)); + } else { + dev_dbg(phy->dev->mt76.dev, + "mt7925: MLO ROC throttled, %u ms remaining\n", + jiffies_to_msecs(throttle)); + return -EBUSY; + } + } + + /* Clear stale abort flag from previous ROC */ + clear_bit(MT76_STATE_ROC_ABORT, &phy->mt76->state); + if (WARN_ON_ONCE(test_and_set_bit(MT76_STATE_ROC, &phy->mt76->state))) return -EBUSY; @@ -554,7 +685,10 @@ static int mt7925_set_mlo_roc(struct mt792x_phy *phy, if (!wait_event_timeout(phy->roc_wait, phy->roc_grant, 4 * HZ)) { mt7925_mcu_abort_roc(phy, mconf, phy->roc_token_id); clear_bit(MT76_STATE_ROC, &phy->mt76->state); + mt7925_roc_record_timeout(phy); err = -ETIMEDOUT; + } else { + mt7925_roc_clear_timeout(phy); } out: @@ -571,6 +705,7 @@ static int mt7925_remain_on_channel(struct ieee80211_hw *hw, struct mt792x_phy *phy = mt792x_hw_phy(hw); int err; + cancel_work_sync(&phy->roc_work); mt792x_mutex_acquire(phy->dev); err = mt7925_set_roc(phy, &mvif->bss_conf, chan, duration, MT7925_ROC_REQ_ROC); @@ -878,14 +1013,14 @@ static int mt7925_mac_link_sta_add(struct mt76_dev *mdev, if (!mlink) return -EINVAL; - idx = mt76_wcid_alloc(dev->mt76.wcid_mask, MT792x_WTBL_STA - 1); - if (idx < 0) - return -ENOSPC; - mconf = mt792x_vif_to_link(mvif, link_id); if (!mconf) return -EINVAL; + idx = mt76_wcid_alloc(dev->mt76.wcid_mask, MT792x_WTBL_STA - 1); + if (idx < 0) + return -ENOSPC; + mt76_wcid_init(&mlink->wcid, 0); mlink->wcid.sta = 1; mlink->wcid.idx = idx; @@ -905,14 +1040,16 @@ static int mt7925_mac_link_sta_add(struct mt76_dev *mdev, ret = mt76_connac_pm_wake(&dev->mphy, &dev->pm); if (ret) - return ret; + goto err_wcid; mt7925_mac_wtbl_update(dev, idx, MT_WTBL_UPDATE_ADM_COUNT_CLEAR); link_conf = mt792x_vif_to_bss_conf(vif, link_id); - if (!link_conf) - return -EINVAL; + if (!link_conf) { + ret = -EINVAL; + goto err_wcid; + } /* should update bss info before STA add */ 
if (vif->type == NL80211_IFTYPE_STATION && !link_sta->sta->tdls) { @@ -924,7 +1061,7 @@ static int mt7925_mac_link_sta_add(struct mt76_dev *mdev, ret = mt7925_mcu_add_bss_info(&dev->phy, mconf->mt76.ctx, link_conf, link_sta, false); if (ret) - return ret; + goto err_wcid; } if (ieee80211_vif_is_mld(vif) && @@ -932,28 +1069,34 @@ static int mt7925_mac_link_sta_add(struct mt76_dev *mdev, ret = mt7925_mcu_sta_update(dev, link_sta, vif, true, MT76_STA_INFO_STATE_NONE); if (ret) - return ret; + goto err_wcid; } else if (ieee80211_vif_is_mld(vif) && link_sta != mlink->pri_link) { ret = mt7925_mcu_sta_update(dev, mlink->pri_link, vif, true, MT76_STA_INFO_STATE_ASSOC); if (ret) - return ret; + goto err_wcid; ret = mt7925_mcu_sta_update(dev, link_sta, vif, true, MT76_STA_INFO_STATE_ASSOC); if (ret) - return ret; + goto err_wcid; } else { ret = mt7925_mcu_sta_update(dev, link_sta, vif, true, MT76_STA_INFO_STATE_NONE); if (ret) - return ret; + goto err_wcid; } mt76_connac_power_save_sched(&dev->mphy, &dev->pm); return 0; + +err_wcid: + rcu_assign_pointer(dev->mt76.wcid[idx], NULL); + mt76_wcid_mask_clear(dev->mt76.wcid_mask, idx); + mt76_connac_power_save_sched(&dev->mphy, &dev->pm); + return ret; } static int @@ -1139,6 +1282,9 @@ static void mt7925_mac_link_sta_remove(struct mt76_dev *mdev, if (!mlink) return; + /* With Sean's fix, roc_abort_sync uses cancel_work() instead of + * cancel_work_sync(), so it's safe to call even with mutex held. + */ mt7925_roc_abort_sync(dev); mt76_connac_free_pending_tx_skbs(&dev->pm, &mlink->wcid); @@ -1534,6 +1680,8 @@ static int mt7925_suspend(struct ieee80211_hw *hw, cancel_delayed_work_sync(&dev->pm.ps_work); mt76_connac_free_pending_tx_skbs(&dev->pm, NULL); + /* Cancel ROC before quiescing starts */ + mt7925_roc_abort_sync(dev); mt792x_mutex_acquire(dev); clear_bit(MT76_STATE_RUNNING, &phy->mt76->state); @@ -1880,6 +2028,8 @@ static void mt7925_mgd_prepare_tx(struct ieee80211_hw *hw, u16 duration = info->duration ? 
info->duration : jiffies_to_msecs(HZ); + cancel_work_sync(&mvif->phy->roc_work); + mt792x_mutex_acquire(dev); mt7925_set_roc(mvif->phy, &mvif->bss_conf, mvif->bss_conf.mt76.ctx->def.chan, duration, @@ -2037,6 +2187,7 @@ mt7925_change_vif_links(struct ieee80211_hw *hw, struct ieee80211_vif *vif, if (old_links == new_links) return 0; + cancel_work_sync(&phy->roc_work); mt792x_mutex_acquire(dev); for_each_set_bit(link_id, &rem, IEEE80211_MLD_MAX_NUM_LINKS) { diff --git a/drivers/net/wireless/mediatek/mt76/mt792x.h b/drivers/net/wireless/mediatek/mt76/mt792x.h index 8388638ed550..d9c1ea709390 100644 --- a/drivers/net/wireless/mediatek/mt76/mt792x.h +++ b/drivers/net/wireless/mediatek/mt76/mt792x.h @@ -186,6 +186,13 @@ struct mt792x_phy { wait_queue_head_t roc_wait; u8 roc_token_id; bool roc_grant; + + /* ROC rate limiting to prevent MCU overload during rapid reconnection + * cycles (e.g., MLO authentication failures causing repeated ROC). + */ + u8 roc_timeout_count; /* consecutive ROC timeouts */ + unsigned long roc_last_timeout; /* jiffies of last timeout */ + unsigned long roc_backoff_until;/* don't issue ROC until this time */ }; struct mt792x_irq_map { -- 2.52.0 ^ permalink raw reply related [flat|nested] 113+ messages in thread
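For readers following the rate limiting in patch 12/13, the backoff arithmetic in mt7925_roc_record_timeout() and the wait-or-busy decision in mt7925_set_roc() can be sketched in user space as below. This is an illustration, not the driver code: `roc_backoff_ms()` and `roc_throttle_action()` are hypothetical names, and the decision is expressed in milliseconds here, whereas the patch compares jiffies.

```c
#include <assert.h>

#define ROC_BACKOFF_BASE_MS	100u
#define ROC_BACKOFF_MAX_MS	1600u
#define ROC_SLEEP_LIMIT_MS	200u

/* Backoff after the Nth consecutive ROC timeout (N >= 1):
 * 100, 200, 400, 800, 1600 ms, then capped at 1600 ms.
 */
static unsigned int roc_backoff_ms(unsigned int timeout_count)
{
	unsigned int shift;
	unsigned int backoff_ms;

	assert(timeout_count >= 1);

	/* Clamp the shift so the value stops doubling after five timeouts. */
	shift = timeout_count - 1;
	if (shift > 4)
		shift = 4;

	backoff_ms = ROC_BACKOFF_BASE_MS << shift;
	if (backoff_ms > ROC_BACKOFF_MAX_MS)
		backoff_ms = ROC_BACKOFF_MAX_MS;

	return backoff_ms;
}

enum roc_throttle_action { ROC_PROCEED, ROC_SLEEP, ROC_BUSY };

/* What a new ROC request does given the remaining backoff time:
 * proceed if none is left, sleep through a short remainder, and
 * report busy (-EBUSY in the patch) for a longer one.
 */
static enum roc_throttle_action roc_throttle_action(unsigned int remaining_ms)
{
	if (remaining_ms == 0)
		return ROC_PROCEED;
	if (remaining_ms < ROC_SLEEP_LIMIT_MS)
		return ROC_SLEEP;
	return ROC_BUSY;
}
```

Under these assumptions, only the first timeout (100 ms backoff) can ever lead to the sleeping branch; from the second timeout on, a request arriving at the start of the backoff window is rejected as busy.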
* Re: [PATCH 12/13] wifi: mt76: mt7925: fix ROC deadlocks and race conditions
  2026-01-20 20:10 ` [PATCH 12/13] wifi: mt76: mt7925: fix ROC deadlocks and race conditions Zac
@ 2026-01-27 11:06   ` Felix Fietkau
  0 siblings, 0 replies; 113+ messages in thread
From: Felix Fietkau @ 2026-01-27 11:06 UTC (permalink / raw)
  To: Zac, sean.wang
  Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless,
	linux, lorenzo, ryder.lee, sean.wang, zbowling

On 20.01.26 21:10, Zac wrote:
> From: Zac Bowling <zac@zacbowling.com>
>
> Fix multiple interrelated issues in the remain-on-channel (ROC) handling
> that cause deadlocks, race conditions, and resource leaks.
>
> Problems fixed:
>
> 1. Deadlock in sta removal ROC abort path:
>    When a station is removed while a ROC operation is in progress, the
>    driver would call mt7925_roc_abort_sync() which waits for ROC completion.
>    However, the ROC work itself needs to acquire mt792x_mutex which is
>    already held during station removal, causing a deadlock.
>
>    Fix: Use async ROC abort (mt76_connac_mcu_abort_roc) when called from
>    paths that already hold the mutex, and add MT76_STATE_ROC_ABORT flag
>    to coordinate between the abort and the ROC timer.
>
> 2. ROC timer race during suspend:
>    The ROC timer could fire after the device started suspending but before
>    the ROC was properly aborted, causing undefined behavior.
>
>    Fix: Delete ROC timer synchronously before suspend and check device
>    state before processing ROC timeout.
>
> 3. ROC rate limiting for MLO auth failures:
>    Rapid ROC requests during MLO authentication can overwhelm the firmware,
>    causing authentication timeouts. The MT7925 firmware has limited ROC
>    handling capacity.
>
>    Fix: Add rate limiting infrastructure with configurable minimum interval
>    between ROC requests. Track last ROC completion time and defer new
>    requests if they arrive too quickly.
>
> 4. WCID leak in ROC cleanup:
>    When ROC operations are aborted, the associated WCID resources were
>    not being properly released, causing resource exhaustion over time.
>
>    Fix: Ensure WCID cleanup happens in all ROC termination paths.
>
> 5. Async ROC abort race condition:
>    The async ROC abort could race with normal ROC completion, causing
>    double-free or use-after-free of ROC resources.
>
>    Fix: Use MT76_STATE_ROC_ABORT flag and proper synchronization to
>    prevent races between async abort and normal completion paths.
>
> These fixes work together to provide robust ROC handling that doesn't
> deadlock, properly releases resources, and handles edge cases during
> suspend and MLO operations.
>
> Fixes: c948b5da6bbe ("wifi: mt76: mt7925: add Mediatek Wi-Fi7 driver for mt7925 device")
> Signed-off-by: Zac Bowling <zac@zacbowling.com>

The rate limiting code seems a bit suspicious to me. What does "limited
ROC handling capacity" mean? Outstanding ROC requests? Does it need time
to settle after a completed ROC?
This needs to be clarified and likely replaced with a more targeted fix.

- Felix

^ permalink raw reply	[flat|nested] 113+ messages in thread
* [PATCH 13/13] wifi: mt76: mt7925: fix double wcid initialization race condition
  2026-01-20 20:10 ` [PATCH v6 00/13] wifi: mt76: stability fixes for deadlocks, NULL derefs, " Zac
                     ` (10 preceding siblings ...)
  2026-01-20 20:10   ` [PATCH 12/13] wifi: mt76: mt7925: fix ROC deadlocks and race conditions Zac
@ 2026-01-20 20:10   ` Zac
  2026-01-27 10:58   ` [PATCH v6 00/13] wifi: mt76: stability fixes for deadlocks, NULL derefs, and race conditions Felix Fietkau
  12 siblings, 0 replies; 113+ messages in thread
From: Zac @ 2026-01-20 20:10 UTC (permalink / raw)
  To: sean.wang
  Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless,
	linux, lorenzo, nbd, ryder.lee, sean.wang, zac, zbowling

Remove duplicate mt76_wcid_init() call in mt7925_mac_link_sta_add that
occurs after the wcid is already published via rcu_assign_pointer().

The wcid is correctly initialized at line 1023 after allocation.
However, a second mt76_wcid_init() call at line 1036 reinitializes the
wcid after it has been published to RCU readers, which can cause:

- List head corruption (tx_list, poll_list) if concurrent code is
  already using the wcid
- Memory leaks from reinitializing the pktid IDR
- Race conditions where readers see partially initialized state

This appears to be a refactoring error where the duplicate call was
left behind.

Fixes: TBD ("wifi: mt76: mt7925: add MLO support")
Signed-off-by: Zac Bowling <zac@zacbowling.com>
---
 drivers/net/wireless/mediatek/mt76/mt7925/main.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/main.c b/drivers/net/wireless/mediatek/mt76/mt7925/main.c
index 4b7c13485497..acce21ad3a29 100644
--- a/drivers/net/wireless/mediatek/mt76/mt7925/main.c
+++ b/drivers/net/wireless/mediatek/mt76/mt7925/main.c
@@ -1033,7 +1033,6 @@ static int mt7925_mac_link_sta_add(struct mt76_dev *mdev,
 	wcid = &mlink->wcid;
 	ewma_signal_init(&wcid->rssi);
 	rcu_assign_pointer(dev->mt76.wcid[wcid->idx], wcid);
-	mt76_wcid_init(wcid, 0);
 	ewma_avg_signal_init(&mlink->avg_ack_signal);
 	memset(mlink->airtime_ac, 0,
 	       sizeof(msta->deflink.airtime_ac));
-- 
2.52.0

^ permalink raw reply related	[flat|nested] 113+ messages in thread
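The hazard described in patch 13/13 (re-initializing an object after it has been published to readers) can be illustrated with a small user-space analogy. This is a sketch only: C11 release/acquire atomics are assumed as a rough stand-in for rcu_assign_pointer() and rcu_dereference(), and `struct entry`, `entry_init()`, `entry_publish()` and `entry_lookup()` are hypothetical names, with the `pending` counter standing in for state such as list heads or the pktid IDR.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define TABLE_SIZE 4

struct entry {
	int idx;
	int pending;	/* stands in for state that readers may modify */
};

/* Lookup table shared with readers, analogous to dev->mt76.wcid[]. */
static _Atomic(struct entry *) table[TABLE_SIZE];

/* Full initialization: only safe while no reader can see the entry. */
static void entry_init(struct entry *e, int idx)
{
	e->idx = idx;
	e->pending = 0;
}

/* Publish with release semantics so readers observe the initialized
 * fields, in the spirit of rcu_assign_pointer().
 */
static void entry_publish(struct entry *e)
{
	assert(e->idx >= 0 && e->idx < TABLE_SIZE);
	atomic_store_explicit(&table[e->idx], e, memory_order_release);
}

/* Reader side lookup, in the spirit of rcu_dereference(). */
static struct entry *entry_lookup(int idx)
{
	assert(idx >= 0 && idx < TABLE_SIZE);
	return atomic_load_explicit(&table[idx], memory_order_acquire);
}
```

If a reader obtains the entry through entry_lookup() and sets `pending`, a second entry_init() call after entry_publish() silently resets that state underneath the reader. The safe order in this model is to initialize once and then publish, which is the order the patch restores by dropping the second mt76_wcid_init() call.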
* Re: [PATCH v6 00/13] wifi: mt76: stability fixes for deadlocks, NULL derefs, and race conditions
  2026-01-20 20:10 ` [PATCH v6 00/13] wifi: mt76: stability fixes for deadlocks, NULL derefs, " Zac
                     ` (11 preceding siblings ...)
  2026-01-20 20:10   ` [PATCH 13/13] wifi: mt76: mt7925: fix double wcid initialization race condition Zac
@ 2026-01-27 10:58   ` Felix Fietkau
  2026-01-29  8:18     ` [PATCH v7 0/6] wifi: mt76: mt7925: MLO stability fixes Zac
  12 siblings, 1 reply; 113+ messages in thread
From: Felix Fietkau @ 2026-01-27 10:58 UTC (permalink / raw)
  To: Zac, sean.wang
  Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless,
	linux, lorenzo, ryder.lee, sean.wang, zbowling

On 20.01.26 21:10, Zac wrote:
> From: Zac Bowling <zac@zacbowling.com>
>
> TLDR: This series addresses stability issues in both the MT7921 and MT7925
> WiFi drivers that cause kernel panics, deadlocks, and system hangs
> on various systems using these drivers.
>
> This v6 series is rebased on Sean Wang's upstream deadlock fix already sent
> which is now included as patch 01/13. The remaining 12 patches are my stability
> fixes.

When you send v7, please include the "v7" in the subject for all
patches, instead of just the cover letter. Working through your patches
in patchwork is getting quite confusing...

- Felix

^ permalink raw reply	[flat|nested] 113+ messages in thread
* [PATCH v7 0/6] wifi: mt76: mt7925: MLO stability fixes
  2026-01-27 10:58 ` [PATCH v6 00/13] wifi: mt76: stability fixes for deadlocks, NULL derefs, and race conditions Felix Fietkau
@ 2026-01-29  8:18   ` Zac
  2026-01-29  8:18     ` [PATCH v7 1/6] wifi: mt76: mt7925: fix double wcid initialization race condition Zac
                       ` (6 more replies)
  0 siblings, 7 replies; 113+ messages in thread
From: Zac @ 2026-01-29 8:18 UTC (permalink / raw)
  To: nbd
  Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless,
	linux, lorenzo, ryder.lee, sean.wang, sean.wang, zac, zbowling

From: Zac Bowling <zac@zacbowling.com>

This patch series addresses several stability issues in the mt7925
driver, particularly around Multi-Link Operation (MLO) scenarios. These
fixes address kernel panics, deadlocks, and race conditions reported by
users on systems like Framework laptops with MT7925 WiFi adapters.

Changes since v6:
- Consolidated from 12 patches to 6 focused patches
- Removed patches that have been merged or superseded upstream
- Improved error handling in AMPDU actions
- Added lockdep assertions for better debugging

The series addresses:
1. Double wcid initialization race condition during station add
2. NULL pointer dereferences during MLO state transitions
3. Missing mutex protection in critical paths
4. MCU command error handling in AMPDU BA session management
5. Lockdep assertions for mutex verification
6. MLO ROC setup error handling

Tested on:
- Framework Laptop 16 with MT7925 (AMD variant)
- Kernel 6.18.x and nbd168/wireless mt76 branch
- Various MLO and non-MLO AP configurations

Zac Bowling (6):
  wifi: mt76: mt7925: fix double wcid initialization race condition
  wifi: mt76: mt7925: add NULL pointer protection for MLO operations
  wifi: mt76: mt7925: add mutex protection in critical paths
  wifi: mt76: mt7925: add MCU command error handling in ampdu_action
  wifi: mt76: mt7925: add lockdep assertions for mutex verification
  wifi: mt76: mt7925: fix MLO ROC setup error handling

 drivers/net/wireless/mediatek/mt76/mt7925/mac.c  |  3 ++
 drivers/net/wireless/mediatek/mt76/mt7925/main.c | 65 +++++++++++++++++++-----
 drivers/net/wireless/mediatek/mt76/mt7925/mcu.c  | 24 +++++++--
 3 files changed, 75 insertions(+), 17 deletions(-)

-- 
2.52.0

^ permalink raw reply	[flat|nested] 113+ messages in thread
* [PATCH v7 1/6] wifi: mt76: mt7925: fix double wcid initialization race condition
  2026-01-29  8:18 ` [PATCH v7 0/6] wifi: mt76: mt7925: MLO stability fixes Zac
@ 2026-01-29  8:18   ` Zac
  2026-01-29  8:18   ` [PATCH v7 2/6] wifi: mt76: mt7925: add NULL pointer protection for MLO state transitions Zac
                     ` (5 subsequent siblings)
  6 siblings, 0 replies; 113+ messages in thread
From: Zac @ 2026-01-29 8:18 UTC (permalink / raw)
  To: nbd
  Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless,
	linux, lorenzo, ryder.lee, sean.wang, sean.wang, zac, zbowling

Remove duplicate mt76_wcid_init() call in mt7925_mac_link_sta_add that
occurs after the wcid is already published via rcu_assign_pointer().

The wcid is correctly initialized at line 873 after allocation.
However, a second mt76_wcid_init() call at line 885 reinitializes the
wcid after it has been published to RCU readers, which can cause:

- List head corruption (tx_list, poll_list) if concurrent code is
  already using the wcid
- Memory leaks from reinitializing the pktid IDR
- Race conditions where readers see partially initialized state

Fixes: c948b5da6bbe ("wifi: mt76: mt7925: add Mediatek Wi-Fi7 driver for mt7925 device")
Signed-off-by: Zac Bowling <zac@zacbowling.com>
---
 drivers/net/wireless/mediatek/mt76/mt7925/main.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/main.c b/drivers/net/wireless/mediatek/mt76/mt7925/main.c
index afcc0fa4aa35..fad3b1505f67 100644
--- a/drivers/net/wireless/mediatek/mt76/mt7925/main.c
+++ b/drivers/net/wireless/mediatek/mt76/mt7925/main.c
@@ -882,7 +882,6 @@ static int mt7925_mac_link_sta_add(struct mt76_dev *mdev,
 	wcid = &mlink->wcid;
 	ewma_signal_init(&wcid->rssi);
 	rcu_assign_pointer(dev->mt76.wcid[wcid->idx], wcid);
-	mt76_wcid_init(wcid, 0);
 	ewma_avg_signal_init(&mlink->avg_ack_signal);
 	memset(mlink->airtime_ac, 0,
 	       sizeof(msta->deflink.airtime_ac));
-- 
2.52.0

^ permalink raw reply related	[flat|nested] 113+ messages in thread
* [PATCH v7 2/6] wifi: mt76: mt7925: add NULL pointer protection for MLO state transitions
  2026-01-29 8:18 ` [PATCH v7 0/6] wifi: mt76: mt7925: MLO stability fixes Zac
  2026-01-29 8:18   ` [PATCH v7 1/6] wifi: mt76: mt7925: fix double wcid initialization race condition Zac
@ 2026-01-29 8:18   ` Zac
  2026-01-29 8:18   ` [PATCH v7 3/6] wifi: mt76: mt7925: add mutex protection in critical paths Zac
                    ` (4 subsequent siblings)
  6 siblings, 0 replies; 113+ messages in thread
From: Zac @ 2026-01-29 8:18 UTC (permalink / raw)
  To: nbd
  Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless,
	linux, lorenzo, ryder.lee, sean.wang, sean.wang, zac, zbowling

Add NULL pointer checks for functions that return pointers to link-related
structures throughout the mt7925 driver. During MLO state transitions,
these functions can return NULL when link configuration is not synchronized.

Functions protected:
- mt792x_vif_to_bss_conf(): Returns link BSS configuration
- mt792x_vif_to_link(): Returns driver link state
- mt792x_sta_to_link(): Returns station link state

Key changes:

1. mt7925_set_link_key():
   - Check link_conf, mconf, mlink before use
   - During MLO roaming, allow key removal to succeed if link is already gone

2. mt7925_mac_link_sta_add():
   - Check mlink and mconf before WCID allocation
   - Check link_conf before BSS info update
   - Add proper WCID cleanup on error paths (err_wcid label)
   - Check MCU return values and propagate errors

3. mt7925_mac_link_sta_assoc():
   - Check mlink before use
   - Check link_conf and mconf before BSS info update

4. mt7925_mac_link_sta_remove():
   - Check mlink before use
   - Check link_conf and mconf before cleanup operations

Prevents crashes during:
- BSSID roaming transitions
- MLO setup and teardown
- Hardware reset operations

Fixes: c948b5da6bbe ("wifi: mt76: mt7925: add Mediatek Wi-Fi7 driver for mt7925 device")
Signed-off-by: Zac Bowling <zac@zacbowling.com>
---
 .../net/wireless/mediatek/mt76/mt7925/main.c | 66 ++++++++++++++-----
 1 file changed, 51 insertions(+), 15 deletions(-)

diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/main.c b/drivers/net/wireless/mediatek/mt76/mt7925/main.c
index fad3b1505f67..88ee90709b75 100644
--- a/drivers/net/wireless/mediatek/mt76/mt7925/main.c
+++ b/drivers/net/wireless/mediatek/mt76/mt7925/main.c
@@ -612,6 +612,17 @@ static int mt7925_set_link_key(struct ieee80211_hw *hw, enum set_key_cmd cmd,
 	link_sta = sta ? mt792x_sta_to_link_sta(vif, sta, link_id) : NULL;
 	mconf = mt792x_vif_to_link(mvif, link_id);
 	mlink = mt792x_sta_to_link(msta, link_id);
+
+	if (!link_conf || !mconf || !mlink) {
+		/* During MLO roaming, link state may be torn down before
+		 * mac80211 requests key removal. If removing a key and
+		 * the link is already gone, consider it successfully removed.
+		 */
+		if (cmd != SET_KEY)
+			return 0;
+		return -EINVAL;
+	}
+
 	wcid = &mlink->wcid;
 	wcid_keyidx = &wcid->hw_key_idx;
 
@@ -864,12 +875,17 @@ static int mt7925_mac_link_sta_add(struct mt76_dev *mdev,
 
 	msta = (struct mt792x_sta *)link_sta->sta->drv_priv;
 	mlink = mt792x_sta_to_link(msta, link_id);
+	if (!mlink)
+		return -EINVAL;
+
+	mconf = mt792x_vif_to_link(mvif, link_id);
+	if (!mconf)
+		return -EINVAL;
 
 	idx = mt76_wcid_alloc(dev->mt76.wcid_mask, MT792x_WTBL_STA - 1);
 	if (idx < 0)
 		return -ENOSPC;
 
-	mconf = mt792x_vif_to_link(mvif, link_id);
 	mt76_wcid_init(&mlink->wcid, 0);
 	mlink->wcid.sta = 1;
 	mlink->wcid.idx = idx;
@@ -888,21 +904,28 @@ static int mt7925_mac_link_sta_add(struct mt76_dev *mdev,
 
 	ret = mt76_connac_pm_wake(&dev->mphy, &dev->pm);
 	if (ret)
-		return ret;
+		goto err_wcid;
 
 	mt7925_mac_wtbl_update(dev, idx,
 			       MT_WTBL_UPDATE_ADM_COUNT_CLEAR);
 
 	link_conf = mt792x_vif_to_bss_conf(vif, link_id);
+	if (!link_conf) {
+		ret = -EINVAL;
+		goto err_wcid;
+	}
 
 	/* should update bss info before STA add */
 	if (vif->type == NL80211_IFTYPE_STATION && !link_sta->sta->tdls) {
 		if (ieee80211_vif_is_mld(vif))
-			mt7925_mcu_add_bss_info(&dev->phy, mconf->mt76.ctx,
-						link_conf, link_sta, link_sta != mlink->pri_link);
+			ret = mt7925_mcu_add_bss_info(&dev->phy, mconf->mt76.ctx,
+						      link_conf, link_sta,
+						      link_sta != mlink->pri_link);
 		else
-			mt7925_mcu_add_bss_info(&dev->phy, mconf->mt76.ctx,
-						link_conf, link_sta, false);
+			ret = mt7925_mcu_add_bss_info(&dev->phy, mconf->mt76.ctx,
+						      link_conf, link_sta, false);
+		if (ret)
+			goto err_wcid;
 	}
 
 	if (ieee80211_vif_is_mld(vif) &&
@@ -910,28 +933,34 @@ static int mt7925_mac_link_sta_add(struct mt76_dev *mdev,
 		ret = mt7925_mcu_sta_update(dev, link_sta, vif, true,
 					    MT76_STA_INFO_STATE_NONE);
 		if (ret)
-			return ret;
+			goto err_wcid;
 	} else if (ieee80211_vif_is_mld(vif) &&
 		   link_sta != mlink->pri_link) {
 		ret = mt7925_mcu_sta_update(dev, mlink->pri_link, vif,
 					    true, MT76_STA_INFO_STATE_ASSOC);
 		if (ret)
-			return ret;
+			goto err_wcid;
 
 		ret = mt7925_mcu_sta_update(dev, link_sta, vif, true,
 					    MT76_STA_INFO_STATE_ASSOC);
 		if (ret)
-			return ret;
+			goto err_wcid;
 	} else {
 		ret = mt7925_mcu_sta_update(dev, link_sta, vif, true,
 					    MT76_STA_INFO_STATE_NONE);
 		if (ret)
-			return ret;
+			goto err_wcid;
 	}
 
 	mt76_connac_power_save_sched(&dev->mphy, &dev->pm);
 
 	return 0;
+
+err_wcid:
+	rcu_assign_pointer(dev->mt76.wcid[idx], NULL);
+	mt76_wcid_mask_clear(dev->mt76.wcid_mask, idx);
+	mt76_connac_power_save_sched(&dev->mphy, &dev->pm);
+	return ret;
 }
 
 static int
@@ -1039,6 +1068,8 @@ static void mt7925_mac_link_sta_assoc(struct mt76_dev *mdev,
 
 	msta = (struct mt792x_sta *)link_sta->sta->drv_priv;
 	mlink = mt792x_sta_to_link(msta, link_sta->link_id);
+	if (!mlink)
+		return;
 
 	mt792x_mutex_acquire(dev);
 
@@ -1048,12 +1079,13 @@ static void mt7925_mac_link_sta_assoc(struct mt76_dev *mdev,
 		link_conf = mt792x_vif_to_bss_conf(vif, vif->bss_conf.link_id);
 	}
 
-	if (vif->type == NL80211_IFTYPE_STATION && !link_sta->sta->tdls) {
+	if (link_conf && vif->type == NL80211_IFTYPE_STATION && !link_sta->sta->tdls) {
 		struct mt792x_bss_conf *mconf;
 
 		mconf = mt792x_link_conf_to_mconf(link_conf);
-		mt7925_mcu_add_bss_info(&dev->phy, mconf->mt76.ctx,
-					link_conf, link_sta, true);
+		if (mconf)
+			mt7925_mcu_add_bss_info(&dev->phy, mconf->mt76.ctx,
+						link_conf, link_sta, true);
 	}
 
 	ewma_avg_signal_init(&mlink->avg_ack_signal);
@@ -1100,6 +1132,8 @@ static void mt7925_mac_link_sta_remove(struct mt76_dev *mdev,
 
 	msta = (struct mt792x_sta *)link_sta->sta->drv_priv;
 	mlink = mt792x_sta_to_link(msta, link_id);
+	if (!mlink)
+		return;
 
 	mt7925_roc_abort_sync(dev);
 
@@ -1113,10 +1147,12 @@ static void mt7925_mac_link_sta_remove(struct mt76_dev *mdev,
 
 	link_conf = mt792x_vif_to_bss_conf(vif, link_id);
 
-	if (vif->type == NL80211_IFTYPE_STATION && !link_sta->sta->tdls) {
+	if (link_conf && vif->type == NL80211_IFTYPE_STATION && !link_sta->sta->tdls) {
 		struct mt792x_bss_conf *mconf;
 
 		mconf = mt792x_link_conf_to_mconf(link_conf);
+		if (!mconf)
+			goto out;
 
 		if (ieee80211_vif_is_mld(vif))
 			mt792x_mac_link_bss_remove(dev, mconf, mlink);
@@ -1124,7 +1160,7 @@ static void mt7925_mac_link_sta_remove(struct mt76_dev *mdev,
 			mt7925_mcu_add_bss_info(&dev->phy, mconf->mt76.ctx,
 						link_conf, link_sta, false);
 	}
-
+out:
 	spin_lock_bh(&mdev->sta_poll_lock);
 	if (!list_empty(&mlink->wcid.poll_list))
 		list_del_init(&mlink->wcid.poll_list);
-- 
2.52.0
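Note on the err_wcid unwinding in patch 2/6: the shape of the change is
"validate before allocating, and once a slot is allocated, leave only
through a label that releases it". A minimal user-space sketch of that
shape follows, assuming a bitmap allocator as a simplified stand-in for
mt76_wcid_alloc(). All names here (struct sta, slot_alloc(), sta_add(),
hw_cmd_result) are hypothetical and only illustrate the control flow.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define NSLOTS 4

struct sta {
	int idx;
};

static unsigned int slot_mask;		/* bit n set means slot n is in use */
static struct sta *slots[NSLOTS];	/* lookup table of published entries */

/* Return the lowest free slot and mark it used, or -1 if none is free. */
static int slot_alloc(void)
{
	for (int i = 0; i < NSLOTS; i++) {
		if (!(slot_mask & (1u << i))) {
			slot_mask |= 1u << i;
			return i;
		}
	}
	return -1;
}

/* hw_cmd_result stands in for the return value of a firmware command. */
static int sta_add(struct sta *sta, const void *link_conf, int hw_cmd_result)
{
	int idx, ret;

	if (!sta)
		return -EINVAL;	/* nothing allocated yet, plain return */

	idx = slot_alloc();
	if (idx < 0)
		return -ENOSPC;

	sta->idx = idx;
	slots[idx] = sta;

	if (!link_conf) {	/* link not available: unwind the slot */
		ret = -EINVAL;
		goto err_slot;
	}

	ret = hw_cmd_result;	/* propagate command failure */
	if (ret)
		goto err_slot;

	return 0;

err_slot:
	slots[idx] = NULL;		/* unpublish first */
	slot_mask &= ~(1u << idx);	/* then release the slot */
	return ret;
}
```

Every failure after slot_alloc() goes through err_slot, so a failed add
leaves both the mask and the table exactly as they were before the call.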
* [PATCH v7 3/6] wifi: mt76: mt7925: add mutex protection in critical paths
  2026-01-29 8:18 ` [PATCH v7 0/6] wifi: mt76: mt7925: MLO stability fixes Zac
  2026-01-29 8:18   ` [PATCH v7 1/6] wifi: mt76: mt7925: fix double wcid initialization race condition Zac
  2026-01-29 8:18   ` [PATCH v7 2/6] wifi: mt76: mt7925: add NULL pointer protection for MLO state transitions Zac
@ 2026-01-29 8:18   ` Zac
  2026-01-29 8:18   ` [PATCH v7 4/6] wifi: mt76: mt7925: add MCU command error handling in ampdu_action Zac
                    ` (3 subsequent siblings)
  6 siblings, 0 replies; 113+ messages in thread
From: Zac @ 2026-01-29 8:18 UTC (permalink / raw)
  To: nbd
  Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless,
	linux, lorenzo, ryder.lee, sean.wang, sean.wang, zac, zbowling

Add proper mutex protection for mt7925 driver operations that access
hardware state without proper synchronization. This fixes race conditions
that can cause system instability during power management and recovery.

Fixes:

1. mac.c: mt7925_mac_reset_work()
   - Wrap ieee80211_iterate_active_interfaces() with mt792x_mutex
   - The vif_connect_iter callback accesses hardware state

2. main.c: mt7925_set_runtime_pm()
   - Add mutex protection around ieee80211_iterate_active_interfaces()
   - Runtime PM can race with other operations

These protections ensure consistent hardware state access during power
management transitions and recovery operations.

Fixes: c948b5da6bbe ("wifi: mt76: mt7925: add Mediatek Wi-Fi7 driver for mt7925 device")
Signed-off-by: Zac Bowling <zac@zacbowling.com>
---
 drivers/net/wireless/mediatek/mt76/mt7925/mac.c  | 2 ++
 drivers/net/wireless/mediatek/mt76/mt7925/main.c | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/mac.c b/drivers/net/wireless/mediatek/mt76/mt7925/mac.c
index f1f0bc9eab04..88cf214ab452 100644
--- a/drivers/net/wireless/mediatek/mt76/mt7925/mac.c
+++ b/drivers/net/wireless/mediatek/mt76/mt7925/mac.c
@@ -1330,9 +1330,11 @@ void mt7925_mac_reset_work(struct work_struct *work)
 	dev->hw_full_reset = false;
 	pm->suspended = false;
 	ieee80211_wake_queues(hw);
+	mt792x_mutex_acquire(dev);
 	ieee80211_iterate_active_interfaces(hw,
 					    IEEE80211_IFACE_ITER_RESUME_ALL,
 					    mt7925_vif_connect_iter, NULL);
+	mt792x_mutex_release(dev);
 	mt76_connac_power_save_sched(&dev->mt76.phy, pm);
 
 	mt7925_regd_change(&dev->phy, "00");
diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/main.c b/drivers/net/wireless/mediatek/mt76/mt7925/main.c
index 88ee90709b75..82de6f30ec27 100644
--- a/drivers/net/wireless/mediatek/mt76/mt7925/main.c
+++ b/drivers/net/wireless/mediatek/mt76/mt7925/main.c
@@ -770,9 +770,11 @@ void mt7925_set_runtime_pm(struct mt792x_dev *dev)
 	bool monitor = !!(hw->conf.flags & IEEE80211_CONF_MONITOR);
 
 	pm->enable = pm->enable_user && !monitor;
+	mt792x_mutex_acquire(dev);
 	ieee80211_iterate_active_interfaces(hw,
 					    IEEE80211_IFACE_ITER_RESUME_ALL,
 					    mt7925_pm_interface_iter, dev);
+	mt792x_mutex_release(dev);
 	pm->ds_enable = pm->ds_enable_user && !monitor;
 	mt7925_mcu_set_deep_sleep(dev, pm->ds_enable);
 }
-- 
2.52.0
* [PATCH v7 4/6] wifi: mt76: mt7925: add MCU command error handling in ampdu_action
  2026-01-29 8:18 ` [PATCH v7 0/6] wifi: mt76: mt7925: MLO stability fixes Zac
                    ` (2 preceding siblings ...)
  2026-01-29 8:18   ` [PATCH v7 3/6] wifi: mt76: mt7925: add mutex protection in critical paths Zac
@ 2026-01-29 8:18   ` Zac
  2026-01-29 8:18   ` [PATCH v7 5/6] wifi: mt76: mt7925: add lockdep assertions for mutex verification Zac
                    ` (2 subsequent siblings)
  6 siblings, 0 replies; 113+ messages in thread
From: Zac @ 2026-01-29 8:18 UTC (permalink / raw)
  To: nbd
  Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless,
	linux, lorenzo, ryder.lee, sean.wang, sean.wang, zac, zbowling

Add proper error handling for MCU command return values that were
previously being ignored. Without proper error handling, failures in
MCU communication can leave the driver in an inconsistent state.

Changes:
- Check mt7925_mcu_uni_tx_ba() return value
- Check mt7925_mcu_uni_rx_ba() return value
- Return error to mac80211 on failure

Special case for IEEE80211_AMPDU_TX_STOP_CONT:
The ieee80211_stop_tx_ba_cb_irqsafe() callback is kept unconditional
because during beacon loss, the MCU command may fail but mac80211 MUST
be notified to complete the BA session teardown. Otherwise the state
machine gets stuck and triggers WARN in __ieee80211_stop_tx_ba_session().
This matches the behavior of mt7921 and mt7996 drivers.

Fixes: c948b5da6bbe ("wifi: mt76: mt7925: add Mediatek Wi-Fi7 driver for mt7925 device")
Signed-off-by: Zac Bowling <zac@zacbowling.com>
---
 drivers/net/wireless/mediatek/mt76/mt7925/main.c | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/main.c b/drivers/net/wireless/mediatek/mt76/mt7925/main.c
index 82de6f30ec27..8236edb1fb48 100644
--- a/drivers/net/wireless/mediatek/mt76/mt7925/main.c
+++ b/drivers/net/wireless/mediatek/mt76/mt7925/main.c
@@ -1300,22 +1300,22 @@ mt7925_ampdu_action(struct ieee80211_hw *hw, struct ieee80211_vif *vif,
 	case IEEE80211_AMPDU_RX_START:
 		mt76_rx_aggr_start(&dev->mt76, &msta->deflink.wcid, tid, ssn,
 				   params->buf_size);
-		mt7925_mcu_uni_rx_ba(dev, params, true);
+		ret = mt7925_mcu_uni_rx_ba(dev, params, true);
 		break;
 	case IEEE80211_AMPDU_RX_STOP:
 		mt76_rx_aggr_stop(&dev->mt76, &msta->deflink.wcid, tid);
-		mt7925_mcu_uni_rx_ba(dev, params, false);
+		ret = mt7925_mcu_uni_rx_ba(dev, params, false);
 		break;
 	case IEEE80211_AMPDU_TX_OPERATIONAL:
 		mtxq->aggr = true;
 		mtxq->send_bar = false;
-		mt7925_mcu_uni_tx_ba(dev, params, true);
+		ret = mt7925_mcu_uni_tx_ba(dev, params, true);
 		break;
 	case IEEE80211_AMPDU_TX_STOP_FLUSH:
 	case IEEE80211_AMPDU_TX_STOP_FLUSH_CONT:
 		mtxq->aggr = false;
 		clear_bit(tid, &msta->deflink.wcid.ampdu_state);
-		mt7925_mcu_uni_tx_ba(dev, params, false);
+		ret = mt7925_mcu_uni_tx_ba(dev, params, false);
 		break;
 	case IEEE80211_AMPDU_TX_START:
 		set_bit(tid, &msta->deflink.wcid.ampdu_state);
@@ -1324,6 +1324,11 @@ mt7925_ampdu_action(struct ieee80211_hw *hw, struct ieee80211_vif *vif,
 	case IEEE80211_AMPDU_TX_STOP_CONT:
 		mtxq->aggr = false;
 		clear_bit(tid, &msta->deflink.wcid.ampdu_state);
+		/* MCU command may fail during beacon loss, but callback must
+		 * always be called to complete the BA session teardown in
+		 * mac80211. Otherwise the state machine gets stuck and triggers
+		 * WARN in __ieee80211_stop_tx_ba_session().
+		 */
 		mt7925_mcu_uni_tx_ba(dev, params, false);
 		ieee80211_stop_tx_ba_cb_irqsafe(vif, sta->addr, tid);
 		break;
-- 
2.52.0
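Note on the TX_STOP_CONT special case in patch 4/6: most actions now
propagate the command result, while one action treats the command as best
effort and always runs its completion callback. A minimal user-space
sketch of that split follows. The enum values, notify_stop_done() and
handle_action() are hypothetical names for illustration only, and
fw_result is an assumed stand-in for the MCU command's return value.

```c
#include <assert.h>
#include <errno.h>

enum action {
	ACT_RX_START,
	ACT_TX_STOP_CONT,
};

static int callbacks;	/* counts teardown notifications to the upper layer */

static void notify_stop_done(void)
{
	callbacks++;
}

/* fw_result: hypothetical result of the firmware command for this action. */
static int handle_action(enum action act, int fw_result)
{
	int ret = 0;

	switch (act) {
	case ACT_RX_START:
		ret = fw_result;	/* failure is reported to the caller */
		break;
	case ACT_TX_STOP_CONT:
		(void)fw_result;	/* best effort: failure is tolerated */
		notify_stop_done();	/* teardown must always complete */
		break;
	}

	return ret;
}
```

The design choice is that a failed command on the stop path must not
prevent the upper layer from being told the session is finished, so that
path intentionally does not assign ret.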
* [PATCH v7 5/6] wifi: mt76: mt7925: add lockdep assertions for mutex verification
  2026-01-29 8:18 ` [PATCH v7 0/6] wifi: mt76: mt7925: MLO stability fixes Zac
                    ` (3 preceding siblings ...)
  2026-01-29 8:18   ` [PATCH v7 4/6] wifi: mt76: mt7925: add MCU command error handling in ampdu_action Zac
@ 2026-01-29 8:18   ` Zac
  2026-01-29 8:18   ` [PATCH v7 6/6] wifi: mt76: mt7925: fix MLO ROC setup error handling Zac
  2026-01-29 8:46   ` [PATCH 2/6] wifi: mt76: mt7925: add NULL pointer protection for MLO state transitions Zac
  6 siblings, 0 replies; 113+ messages in thread
From: Zac @ 2026-01-29 8:18 UTC (permalink / raw)
  To: nbd
  Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless,
	linux, lorenzo, ryder.lee, sean.wang, sean.wang, zac, zbowling

Add lockdep_assert_held() calls to critical MCU functions to help catch
mutex violations during development and debugging. This follows the
pattern used in other mt76 drivers (mt7996, mt7915, mt7615).

Functions with new assertions:
- mt7925_mcu_add_bss_info(): Core BSS configuration MCU command
- mt7925_mcu_sta_update(): Station record update MCU command
- mt7925_mcu_uni_bss_ps(): Power save state MCU command

These functions modify firmware state and must be called with the
device mutex held to prevent race conditions. The lockdep assertions
will trigger warnings at runtime if code paths exist that call these
functions without proper mutex protection.

Also fixes a potential NULL pointer issue in mt7925_mcu_sta_update()
by initializing mlink to NULL and checking it before use.

Signed-off-by: Zac Bowling <zac@zacbowling.com>
---
 drivers/net/wireless/mediatek/mt76/mt7925/mcu.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/mcu.c b/drivers/net/wireless/mediatek/mt76/mt7925/mcu.c
index 1379bf6a26b5..2ed4af282120 100644
--- a/drivers/net/wireless/mediatek/mt76/mt7925/mcu.c
+++ b/drivers/net/wireless/mediatek/mt76/mt7925/mcu.c
@@ -1532,6 +1532,8 @@ int mt7925_mcu_uni_bss_ps(struct mt792x_dev *dev,
 		},
 	};
 
+	lockdep_assert_held(&dev->mt76.mutex);
+
 	if (link_conf->vif->type != NL80211_IFTYPE_STATION)
 		return -EOPNOTSUPP;
 
@@ -2032,13 +2034,15 @@ int mt7925_mcu_sta_update(struct mt792x_dev *dev,
 		.rcpi = to_rcpi(rssi),
 	};
 	struct mt792x_sta *msta;
-	struct mt792x_link_sta *mlink;
+	struct mt792x_link_sta *mlink = NULL;
+
+	lockdep_assert_held(&dev->mt76.mutex);
 
 	if (link_sta) {
 		msta = (struct mt792x_sta *)link_sta->sta->drv_priv;
 		mlink = mt792x_sta_to_link(msta, link_sta->link_id);
 	}
-	info.wcid = link_sta ? &mlink->wcid : &mvif->sta.deflink.wcid;
+	info.wcid = (link_sta && mlink) ? &mlink->wcid : &mvif->sta.deflink.wcid;
 	info.newly = state != MT76_STA_INFO_STATE_ASSOC;
 
 	return mt7925_mcu_sta_cmd(&dev->mphy, &info);
@@ -2840,6 +2844,8 @@ int mt7925_mcu_add_bss_info(struct mt792x_phy *phy,
 	struct mt792x_link_sta *mlink_bc;
 	struct sk_buff *skb;
 
+	lockdep_assert_held(&dev->mt76.mutex);
+
 	skb = __mt7925_mcu_alloc_bss_req(&dev->mt76, &mconf->mt76,
 					 MT7925_BSS_UPDATE_MAX_SIZE);
 	if (IS_ERR(skb))
-- 
2.52.0
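Note on the assertions in patch 5/6: the idea is that a function which
requires a lock checks for it on entry, so a caller that forgot the lock
is flagged at the call site instead of showing up later as a race. A much
simplified, single-threaded sketch follows. It assumes a plain "held"
flag, whereas real lockdep tracks the lock per acquiring context; struct
dev_lock, dev_assert_held() and fw_cmd() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a mutex that remembers whether it is held. */
struct dev_lock {
	bool held;
};

static int violations;	/* counts calls made without the lock */

static void dev_lock_acquire(struct dev_lock *l)
{
	l->held = true;
}

static void dev_lock_release(struct dev_lock *l)
{
	l->held = false;
}

/* Rough analogue of lockdep_assert_held(): record instead of warning. */
static void dev_assert_held(const struct dev_lock *l)
{
	if (!l->held)
		violations++;
}

/* A command that must only run with the device lock held. */
static int fw_cmd(struct dev_lock *l)
{
	dev_assert_held(l);
	return 0;
}
```

Like the kernel assertion, the check does not prevent the call; it only
makes the missing lock visible during development.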
* [PATCH v7 6/6] wifi: mt76: mt7925: fix MLO ROC setup error handling
  2026-01-29 8:18 ` [PATCH v7 0/6] wifi: mt76: mt7925: MLO stability fixes Zac
                    ` (4 preceding siblings ...)
  2026-01-29 8:18   ` [PATCH v7 5/6] wifi: mt76: mt7925: add lockdep assertions for mutex verification Zac
@ 2026-01-29 8:18   ` Zac
  2026-01-29 8:46   ` [PATCH 2/6] wifi: mt76: mt7925: add NULL pointer protection for MLO state transitions Zac
  6 siblings, 0 replies; 113+ messages in thread
From: Zac @ 2026-01-29 8:18 UTC (permalink / raw)
  To: nbd
  Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless,
	linux, lorenzo, ryder.lee, sean.wang, sean.wang, zac, zbowling

Replace noisy WARN_ON_ONCE checks with silent returns in
mt7925_mcu_set_mlo_roc(). During MLO setup, links may not be fully
configured when ROC is requested. The WARN_ON_ONCE statements were
triggering unnecessary kernel warnings during normal operation.

Changes:
- Replace WARN_ON_ONCE(!link_conf) with silent if (!link_conf)
- Replace WARN_ON_ONCE(!links[i].chan) with silent check
- Add explicit mconf NULL check before use
- Use -ENOLINK error code to indicate link not ready
- Replace continue with return to fail fast on invalid links

The -ENOLINK error code properly indicates that the link is not yet
ready for ROC, allowing upper layers to retry later without generating
spurious kernel warnings.

Fixes: c948b5da6bbe ("wifi: mt76: mt7925: add Mediatek Wi-Fi7 driver for mt7925 device")
Signed-off-by: Zac Bowling <zac@zacbowling.com>
---
 .../net/wireless/mediatek/mt76/mt7925/mcu.c | 20 +++++++++++++------
 1 file changed, 14 insertions(+), 6 deletions(-)

diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/mcu.c b/drivers/net/wireless/mediatek/mt76/mt7925/mcu.c
index 2ed4af282120..5ca2106b1ce0 100644
--- a/drivers/net/wireless/mediatek/mt76/mt7925/mcu.c
+++ b/drivers/net/wireless/mediatek/mt76/mt7925/mcu.c
@@ -1341,15 +1341,23 @@ int mt7925_mcu_set_mlo_roc(struct mt792x_phy *phy, struct mt792x_bss_conf *mconf
 	for (i = 0; i < ARRAY_SIZE(links); i++) {
 		links[i].id = i ? __ffs(~BIT(mconf->link_id) & sel_links) :
 				 mconf->link_id;
+
 		link_conf = mt792x_vif_to_bss_conf(vif, links[i].id);
-		if (WARN_ON_ONCE(!link_conf))
-			return -EPERM;
+		if (!link_conf)
+			return -ENOLINK;
 
 		links[i].chan = link_conf->chanreq.oper.chan;
-		if (WARN_ON_ONCE(!links[i].chan))
-			return -EPERM;
+		if (!links[i].chan)
+			/* Channel not configured yet - this can happen during
+			 * MLO AP setup when links are being added sequentially.
+			 * Return -ENOLINK to indicate link not ready.
+			 */
+			return -ENOLINK;
 
 		links[i].mconf = mt792x_vif_to_link(mvif, links[i].id);
+		if (!links[i].mconf)
+			return -ENOLINK;
+
 		links[i].tag = links[i].id == mconf->link_id ?
 			       UNI_ROC_ACQUIRE : UNI_ROC_SUB_LINK;
 
@@ -1364,8 +1372,8 @@ int mt7925_mcu_set_mlo_roc(struct mt792x_phy *phy, struct mt792x_bss_conf *mconf
 		type = MT7925_ROC_REQ_JOIN;
 
 	for (i = 0; i < ARRAY_SIZE(links) && i < hweight16(vif->active_links); i++) {
-		if (WARN_ON_ONCE(!links[i].mconf || !links[i].chan))
-			continue;
+		if (!links[i].mconf || !links[i].chan)
+			return -ENOLINK;
 
 		chan = links[i].chan;
 		center_ch = ieee80211_frequency_to_channel(chan->center_freq);
-- 
2.52.0
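Note on the "continue" to "return" change in patch 6/6: with continue, a
request could be built from a partially valid set of links; with return,
one unready link fails the whole request with a distinct code. A minimal
user-space sketch of the fail-fast form follows. struct link and
validate_links() are hypothetical names, and the two pointer fields are
assumed stand-ins for the channel and driver link state checked in the
patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical per-link record; NULL members mean "not configured yet". */
struct link {
	const int *chan;
	const void *mconf;
};

/*
 * Fail fast: the first link that is not ready rejects the whole set with
 * -ENOLINK, rather than skipping it and continuing with the rest.
 */
static int validate_links(const struct link *links, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (!links[i].chan || !links[i].mconf)
			return -ENOLINK;
	}

	return 0;
}
```

Using a dedicated code such as -ENOLINK lets a caller tell "not ready
yet" apart from other failures without parsing log output.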
* [PATCH 2/6] wifi: mt76: mt7925: add NULL pointer protection for MLO state transitions
  2026-01-29 8:18 ` [PATCH v7 0/6] wifi: mt76: mt7925: MLO stability fixes Zac
                    ` (5 preceding siblings ...)
  2026-01-29 8:18   ` [PATCH v7 6/6] wifi: mt76: mt7925: fix MLO ROC setup error handling Zac
@ 2026-01-29 8:46   ` Zac
  2026-01-29 9:05     ` [v7 PATCH 7/7] wifi: mt76: mt7925: add error logging for MLO ROC setup in set_links Zac
  6 siblings, 1 reply; 113+ messages in thread
From: Zac @ 2026-01-29 8:46 UTC (permalink / raw)
  To: nbd
  Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless,
	linux, lorenzo, ryder.lee, sean.wang, sean.wang, zac, zbowling

Add NULL pointer checks for functions that return pointers to link-related
structures throughout the mt7925 driver. During MLO state transitions,
these functions can return NULL when link configuration is not synchronized.

Functions protected:
- mt792x_vif_to_bss_conf(): Returns link BSS configuration
- mt792x_vif_to_link(): Returns driver link state
- mt792x_sta_to_link(): Returns station link state

Key changes:

1. mt7925_set_link_key():
   - Check link_conf, mconf, mlink before use
   - During MLO roaming, allow key removal to succeed if link is already gone

2. mt7925_mac_link_sta_add():
   - Check mlink and mconf before WCID allocation
   - Check link_conf before BSS info update
   - Add proper WCID cleanup on error paths (err_wcid label)
   - Check MCU return values and propagate errors

3. mt7925_mac_link_sta_assoc():
   - Check mlink before use
   - Check link_conf and mconf before BSS info update

4. mt7925_mac_link_sta_remove():
   - Check mlink before use
   - Check link_conf and mconf before cleanup operations

Prevents crashes during:
- BSSID roaming transitions
- MLO setup and teardown
- Hardware reset operations

Fixes: c948b5da6bbe ("wifi: mt76: mt7925: add Mediatek Wi-Fi7 driver for mt7925 device")
Signed-off-by: Zac Bowling <zac@zacbowling.com>
---
 .../net/wireless/mediatek/mt76/mt7925/main.c | 67 ++++++++++++++-----
 1 file changed, 52 insertions(+), 15 deletions(-)

diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/main.c b/drivers/net/wireless/mediatek/mt76/mt7925/main.c
index fad3b1505f67..1400633712b7 100644
--- a/drivers/net/wireless/mediatek/mt76/mt7925/main.c
+++ b/drivers/net/wireless/mediatek/mt76/mt7925/main.c
@@ -612,6 +612,17 @@ static int mt7925_set_link_key(struct ieee80211_hw *hw, enum set_key_cmd cmd,
 	link_sta = sta ? mt792x_sta_to_link_sta(vif, sta, link_id) : NULL;
 	mconf = mt792x_vif_to_link(mvif, link_id);
 	mlink = mt792x_sta_to_link(msta, link_id);
+
+	if (!link_conf || !mconf || !mlink) {
+		/* During MLO roaming, link state may be torn down before
+		 * mac80211 requests key removal. If removing a key and
+		 * the link is already gone, consider it successfully removed.
+		 */
+		if (cmd != SET_KEY)
+			return 0;
+		return -EINVAL;
+	}
+
 	wcid = &mlink->wcid;
 	wcid_keyidx = &wcid->hw_key_idx;
 
@@ -864,12 +875,17 @@ static int mt7925_mac_link_sta_add(struct mt76_dev *mdev,
 
 	msta = (struct mt792x_sta *)link_sta->sta->drv_priv;
 	mlink = mt792x_sta_to_link(msta, link_id);
+	if (!mlink)
+		return -EINVAL;
+
+	mconf = mt792x_vif_to_link(mvif, link_id);
+	if (!mconf)
+		return -EINVAL;
 
 	idx = mt76_wcid_alloc(dev->mt76.wcid_mask, MT792x_WTBL_STA - 1);
 	if (idx < 0)
 		return -ENOSPC;
 
-	mconf = mt792x_vif_to_link(mvif, link_id);
 	mt76_wcid_init(&mlink->wcid, 0);
 	mlink->wcid.sta = 1;
 	mlink->wcid.idx = idx;
@@ -888,21 +904,28 @@ static int mt7925_mac_link_sta_add(struct mt76_dev *mdev,
 
 	ret = mt76_connac_pm_wake(&dev->mphy, &dev->pm);
 	if (ret)
-		return ret;
+		goto err_wcid;
 
 	mt7925_mac_wtbl_update(dev, idx,
 			       MT_WTBL_UPDATE_ADM_COUNT_CLEAR);
 
 	link_conf = mt792x_vif_to_bss_conf(vif, link_id);
+	if (!link_conf) {
+		ret = -EINVAL;
+		goto err_wcid;
+	}
 
 	/* should update bss info before STA add */
 	if (vif->type == NL80211_IFTYPE_STATION && !link_sta->sta->tdls) {
 		if (ieee80211_vif_is_mld(vif))
-			mt7925_mcu_add_bss_info(&dev->phy, mconf->mt76.ctx,
-						link_conf, link_sta, link_sta != mlink->pri_link);
+			ret = mt7925_mcu_add_bss_info(&dev->phy, mconf->mt76.ctx,
+						      link_conf, link_sta,
+						      link_sta != mlink->pri_link);
 		else
-			mt7925_mcu_add_bss_info(&dev->phy, mconf->mt76.ctx,
-						link_conf, link_sta, false);
+			ret = mt7925_mcu_add_bss_info(&dev->phy, mconf->mt76.ctx,
+						      link_conf, link_sta, false);
+		if (ret)
+			goto err_wcid;
 	}
 
 	if (ieee80211_vif_is_mld(vif) &&
@@ -910,28 +933,35 @@ static int mt7925_mac_link_sta_add(struct mt76_dev *mdev,
 		ret = mt7925_mcu_sta_update(dev, link_sta, vif, true,
 					    MT76_STA_INFO_STATE_NONE);
 		if (ret)
-			return ret;
+			goto err_wcid;
 	} else if (ieee80211_vif_is_mld(vif) &&
 		   link_sta != mlink->pri_link) {
 		ret = mt7925_mcu_sta_update(dev, mlink->pri_link, vif,
 					    true, MT76_STA_INFO_STATE_ASSOC);
 		if (ret)
-			return ret;
+			goto err_wcid;
 
 		ret = mt7925_mcu_sta_update(dev, link_sta, vif, true,
 					    MT76_STA_INFO_STATE_ASSOC);
 		if (ret)
-			return ret;
+			goto err_wcid;
 	} else {
 		ret = mt7925_mcu_sta_update(dev, link_sta, vif, true,
 					    MT76_STA_INFO_STATE_NONE);
 		if (ret)
-			return ret;
+			goto err_wcid;
 	}
 
 	mt76_connac_power_save_sched(&dev->mphy, &dev->pm);
 
 	return 0;
+
+err_wcid:
+	rcu_assign_pointer(dev->mt76.wcid[idx], NULL);
+	mt76_wcid_cleanup(&dev->mt76, wcid);
+	mt76_wcid_mask_clear(dev->mt76.wcid_mask, idx);
+	mt76_connac_power_save_sched(&dev->mphy, &dev->pm);
+	return ret;
 }
 
 static int
@@ -1039,6 +1069,8 @@ static void mt7925_mac_link_sta_assoc(struct mt76_dev *mdev,
 
 	msta = (struct mt792x_sta *)link_sta->sta->drv_priv;
 	mlink = mt792x_sta_to_link(msta, link_sta->link_id);
+	if (!mlink)
+		return;
 
 	mt792x_mutex_acquire(dev);
 
@@ -1048,12 +1080,13 @@ static void mt7925_mac_link_sta_assoc(struct mt76_dev *mdev,
 		link_conf = mt792x_vif_to_bss_conf(vif, vif->bss_conf.link_id);
 	}
 
-	if (vif->type == NL80211_IFTYPE_STATION && !link_sta->sta->tdls) {
+	if (link_conf && vif->type == NL80211_IFTYPE_STATION && !link_sta->sta->tdls) {
 		struct mt792x_bss_conf *mconf;
 
 		mconf = mt792x_link_conf_to_mconf(link_conf);
-		mt7925_mcu_add_bss_info(&dev->phy, mconf->mt76.ctx,
-					link_conf, link_sta, true);
+		if (mconf)
+			mt7925_mcu_add_bss_info(&dev->phy, mconf->mt76.ctx,
+						link_conf, link_sta, true);
 	}
 
 	ewma_avg_signal_init(&mlink->avg_ack_signal);
@@ -1100,6 +1133,8 @@ static void mt7925_mac_link_sta_remove(struct mt76_dev *mdev,
 
 	msta = (struct mt792x_sta *)link_sta->sta->drv_priv;
 	mlink = mt792x_sta_to_link(msta, link_id);
+	if (!mlink)
+		return;
 
 	mt7925_roc_abort_sync(dev);
 
@@ -1113,10 +1148,12 @@ static void mt7925_mac_link_sta_remove(struct mt76_dev *mdev,
 
 	link_conf = mt792x_vif_to_bss_conf(vif, link_id);
 
-	if (vif->type == NL80211_IFTYPE_STATION && !link_sta->sta->tdls) {
+	if (link_conf && vif->type == NL80211_IFTYPE_STATION && !link_sta->sta->tdls) {
 		struct mt792x_bss_conf *mconf;
 
 		mconf = mt792x_link_conf_to_mconf(link_conf);
+		if (!mconf)
+			goto out;
 
 		if (ieee80211_vif_is_mld(vif))
 			mt792x_mac_link_bss_remove(dev, mconf, mlink);
@@ -1124,7 +1161,7 @@ static void mt7925_mac_link_sta_remove(struct mt76_dev *mdev,
 			mt7925_mcu_add_bss_info(&dev->phy, mconf->mt76.ctx,
 						link_conf, link_sta, false);
 	}
-
+out:
 	spin_lock_bh(&mdev->sta_poll_lock);
 	if (!list_empty(&mlink->wcid.poll_list))
 		list_del_init(&mlink->wcid.poll_list);
-- 
2.52.0
* [v7 PATCH 7/7] wifi: mt76: mt7925: add error logging for MLO ROC setup in set_links
  2026-01-29 8:46   ` [PATCH 2/6] wifi: mt76: mt7925: add NULL pointer protection for MLO state transitions Zac
@ 2026-01-29 9:05     ` Zac
  0 siblings, 0 replies; 113+ messages in thread
From: Zac @ 2026-01-29 9:05 UTC (permalink / raw)
  To: zac
  Cc: deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless,
	linux, lorenzo, nbd, ryder.lee, sean.wang, sean.wang, zbowling

Add error logging in mt7925_mac_set_links() when mt7925_set_mlo_roc()
fails. Previously the error return was silently ignored since the
callback function is void.

The function now logs non-ENOLINK errors as warnings. ENOLINK errors
are expected during link transitions when the link configuration is
not yet ready, and mac80211 will retry the operation later.

This complements the error handling changes in mt7925_mcu_set_mlo_roc()
where WARN_ON_ONCE was replaced with proper -ENOLINK returns.

Fixes: c948b5da6bbe ("wifi: mt76: mt7925: add Mediatek Wi-Fi7 driver for mt7925 device")
Signed-off-by: Zac Bowling <zac@zacbowling.com>
---
 drivers/net/wireless/mediatek/mt76/mt7925/main.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/net/wireless/mediatek/mt76/mt7925/main.c b/drivers/net/wireless/mediatek/mt76/mt7925/main.c
index 0b088c448151..769c09e99d48 100644
--- a/drivers/net/wireless/mediatek/mt76/mt7925/main.c
+++ b/drivers/net/wireless/mediatek/mt76/mt7925/main.c
@@ -1048,11 +1048,16 @@ mt7925_mac_set_links(struct mt76_dev *mdev, struct ieee80211_vif *vif)
 
 	if (band == NL80211_BAND_2GHZ ||
 	    (band == NL80211_BAND_5GHZ && secondary_band == NL80211_BAND_6GHZ)) {
+		int ret;
+
 		mt7925_abort_roc(mvif->phy, &mvif->bss_conf);
 
 		mt792x_mutex_acquire(dev);
 
-		mt7925_set_mlo_roc(mvif->phy, &mvif->bss_conf, sel_links);
+		ret = mt7925_set_mlo_roc(mvif->phy, &mvif->bss_conf, sel_links);
+		if (ret && ret != -ENOLINK)
+			dev_warn(dev->mt76.dev,
+				 "MLO ROC setup failed in set_links: %d\n", ret);
 
 		mt792x_mutex_release(dev);
 	}
-- 
2.52.0
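Note on the logging filter in patch 7/7: a void caller cannot return the
error, so it logs, but only for results other than the expected "link not
ready" code. A minimal user-space sketch of that filter follows.
is_unexpected_error() and report_roc_result() are hypothetical names, and
fprintf() to stderr is an assumed stand-in for dev_warn().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

static int warnings;	/* counts messages that would reach the log */

/* Zero is success; -ENOLINK is the expected "link not ready" case. */
static bool is_unexpected_error(int ret)
{
	return ret && ret != -ENOLINK;
}

/* Void caller: cannot propagate ret, so it reports unexpected values. */
static void report_roc_result(int ret)
{
	if (is_unexpected_error(ret)) {
		warnings++;
		fprintf(stderr, "MLO ROC setup failed: %d\n", ret);
	}
}
```

Keeping the predicate separate from the logging makes the filter easy to
test without capturing log output.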
* Re: [PATCH 11/11] wifi: mt76: mt7925: fix ROC deadlocks and race conditions 2026-01-20 6:28 ` [PATCH 11/11] wifi: mt76: mt7925: fix ROC deadlocks and race conditions Zac 2026-01-20 8:25 ` Sean Wang @ 2026-01-20 11:42 ` kernel test robot 2026-01-20 13:26 ` kernel test robot 2 siblings, 0 replies; 113+ messages in thread From: kernel test robot @ 2026-01-20 11:42 UTC (permalink / raw) To: Zac, sean.wang Cc: oe-kbuild-all, deren.wu, kvalo, linux-kernel, linux-mediatek, linux-wireless, lorenzo, nbd, ryder.lee, sean.wang, stable, linux, zbowling, Zac Bowling Hi Zac, kernel test robot noticed the following build warnings: [auto build test WARNING on wireless-next/main] [also build test WARNING on wireless/main linus/master v6.19-rc6 next-20260119] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch#_base_tree_information] url: https://github.com/intel-lab-lkp/linux/commits/Zac/wifi-mt76-fix-list-corruption-in-mt76_wcid_cleanup/20260120-143842 base: https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next.git main patch link: https://lore.kernel.org/r/20260120062854.126501-12-zac%40zacbowling.com patch subject: [PATCH 11/11] wifi: mt76: mt7925: fix ROC deadlocks and race conditions config: m68k-allyesconfig (https://download.01.org/0day-ci/archive/20260120/202601201954.zxO1N1DS-lkp@intel.com/config) compiler: m68k-linux-gcc (GCC) 15.2.0 reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20260120/202601201954.zxO1N1DS-lkp@intel.com/reproduce) If you fix the issue in a separate patch/commit (i.e. 
not just a new version of the same patch/commit), kindly add following tags | Reported-by: kernel test robot <lkp@intel.com> | Closes: https://lore.kernel.org/oe-kbuild-all/202601201954.zxO1N1DS-lkp@intel.com/ All warnings (new ones prefixed by >>): In file included from include/linux/printk.h:621, from include/linux/kernel.h:31, from include/linux/skbuff.h:13, from include/linux/if_ether.h:19, from include/linux/etherdevice.h:20, from drivers/net/wireless/mediatek/mt76/mt7925/main.c:4: drivers/net/wireless/mediatek/mt76/mt7925/main.c: In function 'mt7925_set_roc': >> drivers/net/wireless/mediatek/mt76/mt7925/main.c:610:33: warning: format '%lu' expects argument of type 'long unsigned int', but argument 4 has type 'unsigned int' [-Wformat=] 610 | "mt7925: ROC throttled, %lu ms remaining\n", | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ include/linux/dynamic_debug.h:231:29: note: in definition of macro '__dynamic_func_call_cls' 231 | func(&id, ##__VA_ARGS__); \ | ^~~~~~~~~~~ include/linux/dynamic_debug.h:261:9: note: in expansion of macro '_dynamic_func_call_cls' 261 | _dynamic_func_call_cls(_DPRINTK_CLASS_DFLT, fmt, func, ##__VA_ARGS__) | ^~~~~~~~~~~~~~~~~~~~~~ include/linux/dynamic_debug.h:284:9: note: in expansion of macro '_dynamic_func_call' 284 | _dynamic_func_call(fmt, __dynamic_dev_dbg, \ | ^~~~~~~~~~~~~~~~~~ include/linux/dev_printk.h:165:9: note: in expansion of macro 'dynamic_dev_dbg' 165 | dynamic_dev_dbg(dev, dev_fmt(fmt), ##__VA_ARGS__) | ^~~~~~~~~~~~~~~ include/linux/dev_printk.h:165:30: note: in expansion of macro 'dev_fmt' 165 | dynamic_dev_dbg(dev, dev_fmt(fmt), ##__VA_ARGS__) | ^~~~~~~ drivers/net/wireless/mediatek/mt76/mt7925/main.c:609:25: note: in expansion of macro 'dev_dbg' 609 | dev_dbg(phy->dev->mt76.dev, | ^~~~~~~ drivers/net/wireless/mediatek/mt76/mt7925/main.c:610:59: note: format string is defined here 610 | "mt7925: ROC throttled, %lu ms remaining\n", | ~~^ | | | long unsigned int | %u drivers/net/wireless/mediatek/mt76/mt7925/main.c: 
In function 'mt7925_set_mlo_roc':
drivers/net/wireless/mediatek/mt76/mt7925/main.c:661:33: warning: format '%lu' expects argument of type 'long unsigned int', but argument 4 has type 'unsigned int' [-Wformat=]
  661 |                                 "mt7925: MLO ROC throttled, %lu ms remaining\n",
      |                                 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
include/linux/dynamic_debug.h:231:29: note: in definition of macro '__dynamic_func_call_cls'
  231 |                 func(&id, ##__VA_ARGS__);               \
      |                             ^~~~~~~~~~~
include/linux/dynamic_debug.h:261:9: note: in expansion of macro '_dynamic_func_call_cls'
  261 |         _dynamic_func_call_cls(_DPRINTK_CLASS_DFLT, fmt, func, ##__VA_ARGS__)
      |         ^~~~~~~~~~~~~~~~~~~~~~
include/linux/dynamic_debug.h:284:9: note: in expansion of macro '_dynamic_func_call'
  284 |         _dynamic_func_call(fmt, __dynamic_dev_dbg,              \
      |         ^~~~~~~~~~~~~~~~~~
include/linux/dev_printk.h:165:9: note: in expansion of macro 'dynamic_dev_dbg'
  165 |         dynamic_dev_dbg(dev, dev_fmt(fmt), ##__VA_ARGS__)
      |         ^~~~~~~~~~~~~~~
include/linux/dev_printk.h:165:30: note: in expansion of macro 'dev_fmt'
  165 |         dynamic_dev_dbg(dev, dev_fmt(fmt), ##__VA_ARGS__)
      |                              ^~~~~~~
drivers/net/wireless/mediatek/mt76/mt7925/main.c:660:25: note: in expansion of macro 'dev_dbg'
  660 |                         dev_dbg(phy->dev->mt76.dev,
      |                         ^~~~~~~
drivers/net/wireless/mediatek/mt76/mt7925/main.c:661:63: note: format string is defined here
  661 |                                 "mt7925: MLO ROC throttled, %lu ms remaining\n",
      |                                                             ~~^
      |                                                               |
      |                                                               long unsigned int
      |                                                             %u

vim +610 drivers/net/wireless/mediatek/mt76/mt7925/main.c

   592
   593  static int mt7925_set_roc(struct mt792x_phy *phy,
   594                            struct mt792x_bss_conf *mconf,
   595                            struct ieee80211_channel *chan,
   596                            int duration,
   597                            enum mt7925_roc_req type)
   598  {
   599          unsigned long throttle;
   600          int err;
   601
   602          /* Check rate limiting - if in backoff period, wait or return busy */
   603          throttle = mt7925_roc_throttle_check(phy);
   604          if (throttle) {
   605                  /* For short backoffs, wait; for longer ones, return busy */
   606                  if (throttle < msecs_to_jiffies(200)) {
   607                          msleep(jiffies_to_msecs(throttle));
   608                  } else {
   609                          dev_dbg(phy->dev->mt76.dev,
 > 610                                  "mt7925: ROC throttled, %lu ms remaining\n",
   611                                  jiffies_to_msecs(throttle));
   612                          return -EBUSY;
   613                  }
   614          }
   615
   616          /* Clear stale abort flag from previous ROC */
   617          clear_bit(MT76_STATE_ROC_ABORT, &phy->mt76->state);
   618
   619          if (test_and_set_bit(MT76_STATE_ROC, &phy->mt76->state))
   620                  return -EBUSY;
   621
   622          phy->roc_grant = false;
   623
   624          err = mt7925_mcu_set_roc(phy, mconf, chan, duration, type,
   625                                   ++phy->roc_token_id);
   626          if (err < 0) {
   627                  clear_bit(MT76_STATE_ROC, &phy->mt76->state);
   628                  goto out;
   629          }
   630
   631          if (!wait_event_timeout(phy->roc_wait, phy->roc_grant, 4 * HZ)) {
   632                  mt7925_mcu_abort_roc(phy, mconf, phy->roc_token_id);
   633                  clear_bit(MT76_STATE_ROC, &phy->mt76->state);
   634                  mt7925_roc_record_timeout(phy);
   635                  err = -ETIMEDOUT;
   636          } else {
   637                  /* Successful ROC - reset timeout tracking */
   638                  mt7925_roc_clear_timeout(phy);
   639          }
   640
   641  out:
   642          return err;
   643  }
   644

--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
* Re: [PATCH 11/11] wifi: mt76: mt7925: fix ROC deadlocks and race conditions
  2026-01-20 6:28 ` [PATCH 11/11] wifi: mt76: mt7925: fix ROC deadlocks and race conditions Zac
  2026-01-20 8:25   ` Sean Wang
  2026-01-20 11:42   ` [PATCH 11/11] wifi: mt76: mt7925: fix ROC deadlocks and race conditions kernel test robot
@ 2026-01-20 13:26   ` kernel test robot
  2 siblings, 0 replies; 113+ messages in thread
From: kernel test robot @ 2026-01-20 13:26 UTC (permalink / raw)
  To: Zac, sean.wang
  Cc: llvm, oe-kbuild-all, deren.wu, kvalo, linux-kernel,
	linux-mediatek, linux-wireless, lorenzo, nbd, ryder.lee,
	sean.wang, stable, linux, zbowling, Zac Bowling

Hi Zac,

kernel test robot noticed the following build warnings:

[auto build test WARNING on wireless-next/main]
[also build test WARNING on wireless/main linus/master v6.19-rc6 next-20260119]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Zac/wifi-mt76-fix-list-corruption-in-mt76_wcid_cleanup/20260120-143842
base:   https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next.git main
patch link:    https://lore.kernel.org/r/20260120062854.126501-12-zac%40zacbowling.com
patch subject: [PATCH 11/11] wifi: mt76: mt7925: fix ROC deadlocks and race conditions
config: i386-randconfig-015-20260120 (https://download.01.org/0day-ci/archive/20260120/202601202144.ee4DM9Pz-lkp@intel.com/config)
compiler: clang version 20.1.8 (https://github.com/llvm/llvm-project 87f0227cb60147a26a1eeb4fb06e3b505e9c7261)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20260120/202601202144.ee4DM9Pz-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202601202144.ee4DM9Pz-lkp@intel.com/

All warnings (new ones prefixed by >>):

>> drivers/net/wireless/mediatek/mt76/mt7925/main.c:611:5: warning: format specifies type 'unsigned long' but the argument has type 'unsigned int' [-Wformat]
     610 |                                 "mt7925: ROC throttled, %lu ms remaining\n",
         |                                                         ~~~
         |                                                         %u
     611 |                                 jiffies_to_msecs(throttle));
         |                                 ^~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/dev_printk.h:165:39: note: expanded from macro 'dev_dbg'
     165 |         dynamic_dev_dbg(dev, dev_fmt(fmt), ##__VA_ARGS__)
         |                                      ~~~     ^~~~~~~~~~~
   include/linux/dynamic_debug.h:285:19: note: expanded from macro 'dynamic_dev_dbg'
     285 |                            dev, fmt, ##__VA_ARGS__)
         |                                 ~~~    ^~~~~~~~~~~
   include/linux/dynamic_debug.h:261:59: note: expanded from macro '_dynamic_func_call'
     261 |         _dynamic_func_call_cls(_DPRINTK_CLASS_DFLT, fmt, func, ##__VA_ARGS__)
         |                                                                  ^~~~~~~~~~~
   include/linux/dynamic_debug.h:259:65: note: expanded from macro '_dynamic_func_call_cls'
     259 |         __dynamic_func_call_cls(__UNIQUE_ID(ddebug), cls, fmt, func, ##__VA_ARGS__)
         |                                                                        ^~~~~~~~~~~
   include/linux/dynamic_debug.h:231:15: note: expanded from macro '__dynamic_func_call_cls'
     231 |                 func(&id, ##__VA_ARGS__);               \
         |                             ^~~~~~~~~~~
   drivers/net/wireless/mediatek/mt76/mt7925/main.c:662:5: warning: format specifies type 'unsigned long' but the argument has type 'unsigned int' [-Wformat]
     661 |                                 "mt7925: MLO ROC throttled, %lu ms remaining\n",
         |                                                             ~~~
         |                                                             %u
     662 |                                 jiffies_to_msecs(throttle));
         |                                 ^~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/dev_printk.h:165:39: note: expanded from macro 'dev_dbg'
     165 |         dynamic_dev_dbg(dev, dev_fmt(fmt), ##__VA_ARGS__)
         |                                      ~~~     ^~~~~~~~~~~
   include/linux/dynamic_debug.h:285:19: note: expanded from macro 'dynamic_dev_dbg'
     285 |                            dev, fmt, ##__VA_ARGS__)
         |                                 ~~~    ^~~~~~~~~~~
   include/linux/dynamic_debug.h:261:59: note: expanded from macro '_dynamic_func_call'
     261 |         _dynamic_func_call_cls(_DPRINTK_CLASS_DFLT, fmt, func, ##__VA_ARGS__)
         |                                                                  ^~~~~~~~~~~
   include/linux/dynamic_debug.h:259:65: note: expanded from macro '_dynamic_func_call_cls'
     259 |         __dynamic_func_call_cls(__UNIQUE_ID(ddebug), cls, fmt, func, ##__VA_ARGS__)
         |                                                                        ^~~~~~~~~~~
   include/linux/dynamic_debug.h:231:15: note: expanded from macro '__dynamic_func_call_cls'
     231 |                 func(&id, ##__VA_ARGS__);               \
         |                             ^~~~~~~~~~~
   2 warnings generated.

vim +611 drivers/net/wireless/mediatek/mt76/mt7925/main.c

   592
   593  static int mt7925_set_roc(struct mt792x_phy *phy,
   594                            struct mt792x_bss_conf *mconf,
   595                            struct ieee80211_channel *chan,
   596                            int duration,
   597                            enum mt7925_roc_req type)
   598  {
   599          unsigned long throttle;
   600          int err;
   601
   602          /* Check rate limiting - if in backoff period, wait or return busy */
   603          throttle = mt7925_roc_throttle_check(phy);
   604          if (throttle) {
   605                  /* For short backoffs, wait; for longer ones, return busy */
   606                  if (throttle < msecs_to_jiffies(200)) {
   607                          msleep(jiffies_to_msecs(throttle));
   608                  } else {
   609                          dev_dbg(phy->dev->mt76.dev,
   610                                  "mt7925: ROC throttled, %lu ms remaining\n",
 > 611                                  jiffies_to_msecs(throttle));
   612                          return -EBUSY;
   613                  }
   614          }
   615
   616          /* Clear stale abort flag from previous ROC */
   617          clear_bit(MT76_STATE_ROC_ABORT, &phy->mt76->state);
   618
   619          if (test_and_set_bit(MT76_STATE_ROC, &phy->mt76->state))
   620                  return -EBUSY;
   621
   622          phy->roc_grant = false;
   623
   624          err = mt7925_mcu_set_roc(phy, mconf, chan, duration, type,
   625                                   ++phy->roc_token_id);
   626          if (err < 0) {
   627                  clear_bit(MT76_STATE_ROC, &phy->mt76->state);
   628                  goto out;
   629          }
   630
   631          if (!wait_event_timeout(phy->roc_wait, phy->roc_grant, 4 * HZ)) {
   632                  mt7925_mcu_abort_roc(phy, mconf, phy->roc_token_id);
   633                  clear_bit(MT76_STATE_ROC, &phy->mt76->state);
   634                  mt7925_roc_record_timeout(phy);
   635                  err = -ETIMEDOUT;
   636          } else {
   637                  /* Successful ROC - reset timeout tracking */
   638                  mt7925_roc_clear_timeout(phy);
   639          }
   640
   641  out:
   642          return err;
   643  }
   644

--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
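
Both compilers flag the same mismatch: the debug message uses '%lu' while the argument is the return value of jiffies_to_msecs(), which is an unsigned int. The following is a minimal userspace sketch of the suggested '%u' form, not the driver code itself. The names sketch_jiffies_to_msecs(), SKETCH_HZ and format_throttle_msg() are hypothetical stand-ins introduced only for this illustration, and the tick rate of 250 is an assumption.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical tick rate, used only by this sketch. */
#define SKETCH_HZ 250u

/*
 * Userspace stand-in for the kernel's jiffies_to_msecs(). Like the kernel
 * helper, it takes an unsigned long and returns an unsigned int, which is
 * the type both warnings report for the argument.
 */
static unsigned int sketch_jiffies_to_msecs(unsigned long j)
{
	return (unsigned int)(j * (1000u / SKETCH_HZ));
}

/*
 * Format the throttle message with %u so that the conversion specifier
 * matches the unsigned int argument. Returns snprintf()'s result.
 */
static int format_throttle_msg(char *buf, size_t len, unsigned long throttle)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "mt7925: ROC throttled, %u ms remaining\n",
			sketch_jiffies_to_msecs(throttle));
}
```

In the driver, the equivalent change would presumably be to replace '%lu' with '%u' in the two dev_dbg() calls in mt7925_set_roc() and mt7925_set_mlo_roc(), as the fix-it hints in both reports suggest. Casting the argument to unsigned long would also silence the warning, but it widens a value that never needs more than unsigned int.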