From: James Prestwood <prestwoj@gmail.com>
To: Martin Petzold <martin.petzold@tavla.de>
Cc: iwd@lists.linux.dev, Arend Van Spriel <arend.vanspriel@broadcom.com>
Subject: Re: Connection loss (IWD HEAD with latest OWE / BSS selection patches) - brcmfmac driver
Date: Mon, 4 Nov 2024 04:36:31 -0800
Message-ID: <f489d856-c788-47c5-8739-325a11d7ff42@gmail.com>
In-Reply-To: <f9ca5733-7cda-4494-8b6d-7a4de4157da2@tavla.de>
On 11/3/24 3:13 PM, Martin Petzold wrote:
> Dear James,
>
> On 25.10.24 at 17:17, James Prestwood wrote:
>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> I open a new thread for this one: During the last weeks I
>>>>>>>>>> have seen connection losses for 30+ minutes, sometimes even
>>>>>>>>>> hours or just now even forever (IWD HEAD with v2 OWE / BSS
>>>>>>>>>> selection patches). Driver is brcmfmac (NXP 6.1.36 kernel)
>>>>>>>>>> and chip is BCM4339 (Laird LWB5).
>>>>>>>>>>
>>>>>>>>>> It happens in a) single router environment (WPA2-PSK;
>>>>>>>>>> Touchstone TG3442DE), and b) router + repeater environment
>>>>>>>>>> (WPA2 CCMP; Fritz!Box + Fritz!Repeater), and maybe also in
>>>>>>>>>> the WPA3 OWE Transition network (yesterday lost a connection
>>>>>>>>>> again).
>>>>>>>>>
>>>>>>>>> I lost now again 2 of 10 devices in the WPA3 OWE network (with
>>>>>>>>> roaming). However, now they don't disappear all after a
>>>>>>>>> shorter while. It seems to be later.
>>>>>>>>>
>>>>>>>>> I also lost one device in a Router+Repeater WPA2 (CCMP)
>>>>>>>>> network. It is confirmed here on router side, that the device
>>>>>>>>> is disconnected. Since more than a day.
>>>>>>>>
>>>>>>>> We can't do anything without logs. If you suspect it's the
>>>>>>>> blacklist you can lower the blacklist time in main.conf:
>>>>>>>>
>>>>>>>> [
>
> I am still losing devices. Sometimes they come back again, but mostly
> do not re-connect. I have observed the following:
>
> - The connection holds for several hours, up to a day or two.
> Then it is gone for several hours or, mostly, forever.
> - For FritzBox+FritzRepeater I have seen the connection coming back
> after like a day (here connection loss was also confirmed on router
> side!)
> - For the Aruba enterprise environment the connection never came back
> (until now no AP logs - waiting for an answer)
> - After reboot the connection comes back
> - It occurs only in an environment with multiple APs with same SSID
> (i.e. roaming environment), however my single AP environments have all
> strong signal
> - Some devices with identical configuration in this environment DO NOT
> get lost, those seem to have quite strong signal (maybe they don't roam)
> - Other devices in the same environment work without any problems
> (Intel+NetworkManager) and the APs are Aruba enterprise grade
> - I see almost the same in the Aruba enterprise environment, but ALSO
> in a FritzBox + FritzRepeater environment
> - We had a bug in our web socket connection, causing too many IWD
> requests. However, this was fixed. And why are all the other devices
> okay? Maybe a coincidence with roaming and something related to dropping
> and re-connecting the web socket connection.
>
> Please find attached my currently available debug logs (they are a few
> days old, but I am quite sure this is the connection loss situation).
> These logs are from the FritzBox+FritzRepeater environment. There are
> no brcmfmac messages (but also no special debug level configured here)!
>
> I have now also disabled WiFi power saving and will deploy to the
> environment...hoping for the best.
>
> Maybe you could check the logs and have an idea?
Looks like the same thing as the last logs you sent. IWD tries to
connect (sends CMD_CONNECT to the kernel) but gets no associated
CMD_CONNECT event after that, which causes IWD to wait indefinitely for
that event. This, again, looks like a driver problem because it's
expected that the kernel tells userspace the result of the CMD_CONNECT
request.

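To make the hang pattern concrete, here is a minimal sketch in Python of how one could flag a connect request that never receives its result. The event records are hypothetical: each is a `(seconds, kind)` tuple where `"request"` stands for a CMD_CONNECT sent to the kernel and `"event"` for the CMD_CONNECT event reporting the result. This is not the output format of iwmon or IWD, and you would have to extract such tuples from your own logs.

```python
from typing import Iterable, List, Tuple

# Hypothetical record: (timestamp in seconds, kind).
# "request" = CMD_CONNECT sent to the kernel,
# "event"   = CMD_CONNECT event carrying the result.
Event = Tuple[float, str]


def unanswered_connects(events: Iterable[Event], timeout: float,
                        now: float) -> List[float]:
    """Return timestamps of connect requests with no result within `timeout`."""
    hung: List[float] = []
    pending = None  # timestamp of the request still waiting for a result
    for ts, kind in sorted(events):
        if kind == "request":
            # A new request while an old one has waited too long: old one hung.
            if pending is not None and ts - pending > timeout:
                hung.append(pending)
            pending = ts
        elif kind == "event" and pending is not None:
            # Result arrived; count it as hung only if it came too late.
            if ts - pending > timeout:
                hung.append(pending)
            pending = None
    # A request still pending at the end of the log.
    if pending is not None and now - pending > timeout:
        hung.append(pending)
    return hung


if __name__ == "__main__":
    log = [(0.0, "request"), (1.0, "event"), (10.0, "request")]
    print(unanswered_connects(log, timeout=5.0, now=100.0))
```

In the sample log the first request is answered after one second, while the second one never gets a result and is reported.
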
The only similarity I can see between the two sets of logs is that there
is a failed connection just prior to the hang. IWD then attempts to
connect again but the 4-way handshake is never started and this results
in a failure with status 16 (group key handshake timeout). In your latest
set of logs IWD actually again tries to connect to a different BSS and
gets status 16 before trying yet again and hanging.

This actually seems similar to an issue I encountered with ath10k where
the network interface would time out being brought up. Retrying would
succeed but the driver would be in a similar state where IWD could
authenticate/associate but no data frames (i.e. 4-way handshake) would
be passed to userspace. The only solution (until upstream fixed the bug)
was to unload/reload the driver when we detected this condition.

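For illustration, a minimal sketch of that unload/reload workaround in Python. It only wraps `modprobe -r` and `modprobe`; the default module name `brcmfmac` is an assumption for your setup (the case above was ath10k), and whether a reload actually clears the condition on brcmfmac is untested. The helper names are made up for this example.

```python
import subprocess
from typing import List


def reload_commands(module: str = "brcmfmac") -> List[List[str]]:
    """Build the commands that unload and then reload a kernel module."""
    return [["modprobe", "-r", module], ["modprobe", module]]


def reload_driver(module: str = "brcmfmac",
                  dry_run: bool = True) -> List[List[str]]:
    """Print the commands (dry run) or execute them; needs root to execute."""
    cmds = reload_commands(module)
    for argv in cmds:
        if dry_run:
            print(" ".join(argv))
        else:
            # check=True raises CalledProcessError if modprobe fails.
            subprocess.run(argv, check=True)
    return cmds


if __name__ == "__main__":
    reload_driver()  # dry run: only prints what would be executed
```

The dry-run default is deliberate so the sketch can be tried without touching a live interface.
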
If you are able to physically attach to a device currently in this state
you may be able to get more info. For example if IWD is stuck like this
try disconnecting/reconnecting with iwctl or restarting IWD to see what
happens. If you end up in the same state right away I'm 99.9% sure the
driver is the entire reason you're running into this.

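The checks above could be written down roughly as follows. This is a sketch in Python that only builds the command lines; the device name `wlan0`, the SSID `MyNetwork` and the systemd unit name `iwd` are placeholders I am assuming, so adjust them to the device at hand.

```python
from typing import List


def diagnostic_steps(device: str, ssid: str) -> List[List[str]]:
    """Commands to try on a device that appears stuck in a connect attempt."""
    return [
        # Step 1: drop the current (stuck) connection attempt.
        ["iwctl", "station", device, "disconnect"],
        # Step 2: ask IWD to connect again to the known network.
        ["iwctl", "station", device, "connect", ssid],
        # Alternative: restart IWD itself (unit name "iwd" is an assumption).
        ["systemctl", "restart", "iwd"],
    ]


if __name__ == "__main__":
    # Placeholder values, not taken from the logs.
    for argv in diagnostic_steps("wlan0", "MyNetwork"):
        print(" ".join(argv))
```

Run the steps by hand one at a time and watch the IWD debug output after each, since the point is to see whether the same state comes back immediately.
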
Thanks,
James