* Re: [BUG] mt7921e: Intermittent connection failure
From: Sean Wang @ 2026-04-01 22:58 UTC
To: silverducks
Cc: linux-wireless, nbd, lorenzo, ryder.lee, shayne.chen, sean.wang,
matthias.bgg, angelogioacchino.delregno,
moderated list:ARM/Mediatek SoC support
Hi,
On Tue, Mar 31, 2026 at 2:40 AM silverducks <silverducks@altlinux.org> wrote:
>
> Greetings!
>
> I apologize for poor formatting in the previous email. I did not realize all
> plain text files' contents would be visible on the mailing list.
> I am attaching an archive containing the same files as in previous email for
> convenience.
> Given compression, I can also avoid using external hosting, which I presume
> is preferred, so I am including all relevant logs in the archive as well.
> I am also including original email text just in case.
I think the current test setup is still mixing too many variables, so
it is hard to tell what is actually triggering the issue.
In particular, if the goal is to test the NetworkManager path, the
script should not also manually manage wpa_supplicant, and iwd should
not be part of the same test either. NetworkManager normally manages
the Wi-Fi backend itself, so mixing manual wpa_supplicant handling,
iwd, and NetworkManager in one setup makes the result difficult to
interpret.
Could you first simplify the setup and test one path at a time?
If you want to test NetworkManager, use only NetworkManager, for
example by using nmcli to explicitly control the connection steps.
If you want to test plain wpa_supplicant, stop NetworkManager
completely and use only wpa_supplicant + wpa_cli. I would suggest
starting with this path, since that is also the setup I usually use
for testing.
If you want to test iwd, please test it separately as well.
I also suggest avoiding suspend/resume and hibernation for now.
The log you shared includes a clear S4 resume path (ACPI: PM: Waking
up from system sleep state S4 and pci_pm_restore returns -110), which
does not match a simple reconnect or module reload test.
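As a rough illustration of the wpa_supplicant-only path suggested above, the sequence could look like the sketch below. The interface name is taken from the logs in this thread. The configuration file path, the use of systemd units, and the DRY_RUN wrapper are assumptions made for illustration, not part of the original test script. By default the sketch only prints the commands.

```shell
#!/bin/sh
# Sketch of a wpa_supplicant-only test path.
# IFACE and CONF are assumptions; adjust them for the machine under test.
IFACE="${IFACE:-wlp2s0}"
CONF="${CONF:-/etc/wpa_supplicant/wpa_supplicant.conf}"
DRY_RUN="${DRY_RUN:-1}"   # default: only print the commands

run() {
    # Print the command; execute it only when DRY_RUN=0.
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

wpa_only_path() {
    # Keep the other Wi-Fi managers out of the picture (assumes systemd units).
    run systemctl stop NetworkManager
    run systemctl stop iwd
    # Start wpa_supplicant by hand, in the background, on one interface.
    run wpa_supplicant -B -i "$IFACE" -c "$CONF"
    # Drive and inspect the connection through wpa_cli only.
    run wpa_cli -i "$IFACE" reconnect
    run wpa_cli -i "$IFACE" status
}

wpa_only_path
```

Running it with DRY_RUN=0 as root would execute the commands instead of printing them.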
* Re: [BUG] mt7921e: Intermittent connection failure
From: silverducks @ 2026-04-14 6:22 UTC
To: Sean Wang
Cc: linux-wireless, nbd, lorenzo, ryder.lee, shayne.chen, sean.wang,
matthias.bgg, angelogioacchino.delregno,
moderated list:ARM/Mediatek SoC support
[-- Attachment #1: Type: text/plain, Size: 454 bytes --]
Greetings!
Update:
The new version of the script reproduces the bug less consistently.
I've also noticed that on the patched kernel it triggers the timeout far
more often than the older script does. I've rolled back to the older one,
which relies on the NM autoconnect feature, though I kept the cleanup
changes.
My guess is that the direct nmcli command often prevents NM from doing
whatever it usually does that triggers the error.
The script and reruns with it are attached.
[-- Attachment #2: attachments.tar.gz --]
[-- Type: application/gzip, Size: 158890 bytes --]
* Re: [BUG] mt7921e: Intermittent connection failure
From: Sean Wang @ 2026-04-16 21:59 UTC
To: silverducks
Cc: linux-wireless, nbd, lorenzo, ryder.lee, shayne.chen, sean.wang,
matthias.bgg, angelogioacchino.delregno,
moderated list:ARM/Mediatek SoC support
Hi,
Thanks for the update and for sharing the revised script and rerun logs.
I tried to reproduce the issue with your script on an mt7921u, but I
could not reproduce the timeout on my side.
In my run, I got 398 reload cycles and 398 successful reconnects:
grep Reloading test_log.txt | wc -l
398
grep Connection test_log.txt | wc -l
398
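The two grep counts above can also be compared in one step. The following is a minimal sketch, assuming the log uses the same "Reloading module" and "Connection succeeded" strings as the excerpt in this thread. The function name check_log is made up for illustration.

```shell
#!/bin/sh
# Compare reload cycles with successful reconnects in a test log.
# The matched strings are taken from the log excerpt in this thread.
check_log() {
    log="$1"
    # grep -c prints 0 but exits non-zero when nothing matches, hence "|| true".
    reloads=$(grep -c 'Reloading module' "$log" || true)
    successes=$(grep -c 'Connection succeeded' "$log" || true)
    echo "reloads=$reloads successes=$successes"
    # Non-zero exit status when the two counts differ,
    # i.e. at least one reload cycle did not end in a reconnect.
    [ "$reloads" -eq "$successes" ]
}
```

Usage would be along the lines of: check_log test_log.txt || echo "some cycles did not reconnect"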
The log consistently looks like this after each module reload [1].
If this were a generic race in the common MCU command layer, I would
expect it to reproduce on mt7921u as well.
Also, could you record and save the "iw event" log when you run the
test? That would help show what userspace-triggered activities are
happening, so that we can mimic the same sequence using wpa_cli.
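One possible way to capture the "iw event" output for exactly the duration of a test run is sketched below. It assumes the iw tool is installed. The function name record_iw_events and the file names in the usage line are hypothetical.

```shell
#!/bin/sh
# Record nl80211 events while one test command runs.
# Usage: record_iw_events <logfile> <command> [args...]
record_iw_events() {
    log="$1"; shift
    # -t prefixes every event with a timestamp.
    iw event -t > "$log" 2>&1 &
    iw_pid=$!
    # Run the test while events are being captured; remember its exit status.
    status=0
    "$@" || status=$?
    # Stop the recorder and reap it.
    kill "$iw_pid" 2>/dev/null || true
    wait "$iw_pid" 2>/dev/null || true
    return "$status"
}
```

For example: record_iw_events iw_event.log ./reload_test.sh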
[1]
Starting WiFi mt7921u reset loop (using NetworkManager)
Making sure network services are down...
----------------------------------------
Reloading module mt7921u...
Starting network services...
Checking for connection...
Checking for connection...
Connected.
Connection succeeded. Cleaning up.
----------------------------------------
Reloading module mt7921u...
Starting network services...
Checking for connection...
Checking for connection...
Connected.
Connection succeeded. Cleaning up.
----------------------------------------
Reloading module mt7921u...
Starting network services...
Checking for connection...
Checking for connection...
Connected.
Connection succeeded. Cleaning up.
----------------------------------------
Reloading module mt7921u...
Starting network services...
Checking for connection...
Checking for connection...
Status: Connecting
Checking for connection...
Connected.
Connection succeeded. Cleaning up.
----------------------------------------
Reloading module mt7921u...
Starting network services...
Checking for connection...
Checking for connection...
Connected.
Connection succeeded. Cleaning up.
----------------------------------------
Reloading module mt7921u...
Starting network services...
Checking for connection...
Checking for connection...
Status: Connecting
Checking for connection...
Connected.
Connection succeeded. Cleaning up.
----------------------------------------
Reloading module mt7921u...
Starting network services...
Checking for connection...
Checking for connection...
Connected.
Connection succeeded. Cleaning up.
----------------------------------------
Reloading module mt7921u...
Starting network services...
Checking for connection...
Checking for connection...
Connected.
Connection succeeded. Cleaning up.
----------------------------------------
Reloading module mt7921u...
Starting network services...
Checking for connection...
Checking for connection...
Connected.
Connection succeeded. Cleaning up.
----------------------------------------
Reloading module mt7921u...
Starting network services...
Checking for connection...
Checking for connection...
Connected.
Connection succeeded. Cleaning up.
----------------------------------------
...
and so on
* Re: [BUG] mt7921e: Intermittent connection failure
From: silverducks @ 2026-04-17 10:51 UTC
To: Sean Wang
Cc: linux-wireless, nbd, lorenzo, ryder.lee, shayne.chen, sean.wang,
matthias.bgg, angelogioacchino.delregno,
moderated list:ARM/Mediatek SoC support
[-- Attachment #1: Type: text/plain, Size: 3853 bytes --]
Greetings!
On 17/04/2026 00:59, Sean Wang wrote:
> I tried to reproduce the issue with your script on an mt7921u, but I
> could not reproduce the timeout on my side.
> <...>
> If this were a generic race in the common MCU command layer, I would
> expect it to reproduce on mt7921u as well.
It sounds like you are referring to the original bug as the timeout here.
I may not have been clear about this, so let me restate it more
precisely. My apologies if this is redundant.
--------------
mt7921e 0000:02:00.0: Message 0004001b (seq 9) timeout
--------------
This timeout error appears with the aforementioned patch applied.
It seems to "replace" the bug during NM-autoconnect-reliant runs -
its occurrence rate is the same and the original bug no longer
triggers. It causes the driver to reset:
--------------
[ 124.601521] mt7921e 0000:02:00.0: Message 0004001b (seq 5) timeout
[ 124.689461] mt7921e 0000:02:00.0: HW/SW Version: 0x8a108a10, Build Time: 20250625153620a
[ 124.726504] mt7921e 0000:02:00.0: WM Firmware Version: ____010000, Build Time: 20250625153703
--------------
The original bug, on the other hand, makes the device unresponsive
but produces no error message and does not cause the driver to reset [1].
If I use nmcli to connect instead (as in the previous revision of the
script) with the patch applied, this timeout appears significantly
more often. It could be a statistical outlier, but I did run this a
number of times.
Without the patch, using nmcli to connect mostly gets rid of the bug:
the occurrence rate drops considerably, though not to zero.
I think it simply leaves NM less time to trigger the bug.
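To tell the two failure modes apart when going through many runs, a saved dmesg log could be classified with something like the sketch below. The patterns are based only on the log lines quoted in this message. The function name and the three labels are made up for illustration, and a real log may need looser patterns.

```shell
#!/bin/sh
# Classify a saved dmesg log by the failure signature it contains.
classify_dmesg() {
    log="$1"
    if grep -Eq 'Message [0-9a-f]+ \(seq [0-9]+\) timeout' "$log"; then
        # MCU message timeout, followed by a driver reset.
        echo "mcu-timeout"
    elif grep -Eq 'authentication with .* timed out' "$log"; then
        # No driver error, but authentication attempts never complete.
        echo "silent-auth-failure"
    else
        echo "no-failure-seen"
    fi
}
```

For example: dmesg > run.log; classify_dmesg run.log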
> If this were a generic race in the common MCU command layer, I would
> expect it to reproduce on mt7921u as well.
Perhaps it could still exist in the common layer, if the USB version
does not trigger it because of slower bus speed, latency, or
some other difference?
That said, I'm not claiming that the aforementioned patch fixes the core
issue. More likely it only prevents the trigger from occurring.
> Also, could you record and save the "iw event" log when you run the
> test? That would help show what userspace-triggered
> activities are happening. we can mimic the same sequence using wpa_cli
Of course. Attaching logs for module reload and reconnecting cases.
Thank you for your time.
[1]
<cut>
[ 208.992595] wlp2s0: deauthenticating from 34:60:f9:99:08:38 by local choice (Reason: 3=DEAUTH_LEAVING)
[ 209.611215] mt7921e 0000:02:00.0: ASIC revision: 79610010
[ 209.697087] mt7921e 0000:02:00.0: HW/SW Version: 0x8a108a10, Build Time: 20250625153620a
[ 209.734750] mt7921e 0000:02:00.0: WM Firmware Version: ____010000, Build Time: 20250625153703
[ 210.579495] mt7921e 0000:02:00.0 wlp2s0: renamed from wlan0
[ 215.603597] wlp2s0: authenticate with 34:60:f9:99:08:38 (local address=14:5a:fc:77:82:1f)
[ 215.718389] wlp2s0: send auth to 34:60:f9:99:08:38 (try 1/3)
[ 218.790084] wlp2s0: send auth to 34:60:f9:99:08:38 (try 2/3)
[ 220.723047] wlp2s0: aborting authentication with 34:60:f9:99:08:38 by local choice (Reason: 3=DEAUTH_LEAVING)
[ 234.992512] wlp2s0: authenticate with 34:60:f9:99:08:38 (local address=14:5a:fc:77:82:1f)
[ 236.005909] wlp2s0: send auth to 34:60:f9:99:08:38 (try 1/3)
[ 239.078085] wlp2s0: send auth to 34:60:f9:99:08:38 (try 2/3)
[ 240.695078] wlp2s0: aborting authentication with 34:60:f9:99:08:38 by local choice (Reason: 3=DEAUTH_LEAVING)
[ 254.926181] wlp2s0: authenticate with 34:60:f9:99:08:38 (local address=14:5a:fc:77:82:1f)
[ 255.973995] wlp2s0: send auth to 34:60:f9:99:08:38 (try 1/3)
[ 257.061953] wlp2s0: send auth to 34:60:f9:99:08:38 (try 2/3)
[ 258.149899] wlp2s0: send auth to 34:60:f9:99:08:38 (try 3/3)
[ 258.173202] wlp2s0: authentication with 34:60:f9:99:08:38 timed out
<end of log>
[-- Attachment #2: attachments.tar.gz --]
[-- Type: application/gzip, Size: 169901 bytes --]