public inbox for iwd@lists.linux.dev
From: James Prestwood <prestwoj@gmail.com>
To: Martin Petzold <martin.petzold@tavla.de>
Cc: iwd@lists.linux.dev, Arend Van Spriel <arend.vanspriel@broadcom.com>
Subject: Re: Connection loss (IWD HEAD with latest OWE / BSS selection patches) - brcmfmac driver
Date: Fri, 25 Oct 2024 08:17:11 -0700
Message-ID: <a64c4733-9011-448c-ab75-8916ff339ac4@gmail.com>
In-Reply-To: <78437e50-e6b1-4965-bd03-776fcf3c9801@tavla.de>


On 10/25/24 8:03 AM, Martin Petzold wrote:
> Hi James,
>
> On 25.10.24 at 15:18, James Prestwood wrote:
>>
>> On 10/25/24 6:11 AM, Martin Petzold wrote:
>>> Hi James,
>>>
>>> On 25.10.24 at 14:54, James Prestwood wrote:
>>>> Hi Martin,
>>>>
>>>> On 10/25/24 5:28 AM, Martin Petzold wrote:
>>>>> Hi James,
>>>>>
>>>>> On 25.10.24 at 13:48, James Prestwood wrote:
>>>>>> Hi Martin,
>>>>>>
>>>>>> On 10/25/24 4:10 AM, Martin Petzold wrote:
>>>>>>> Hi all,
>>>>>>>
>>>>>>> On 25.10.24 at 12:12, Martin Petzold wrote:
>>>>>>>> Hi all,
>>>>>>>>
>>>>>>>> I am opening a new thread for this one. Over the last few weeks 
>>>>>>>> I have seen connection losses lasting 30+ minutes, sometimes 
>>>>>>>> hours, and just now one that never recovered (IWD HEAD with v2 
>>>>>>>> OWE / BSS selection patches). The driver is brcmfmac (NXP 6.1.36 
>>>>>>>> kernel) and the chip is BCM4339 (Laird LWB5).
>>>>>>>>
>>>>>>>> It happens in a) a single-router environment (WPA2-PSK; 
>>>>>>>> Touchstone TG3442DE), and b) a router + repeater environment 
>>>>>>>> (WPA2 CCMP; Fritz!Box + Fritz!Repeater), and maybe also in the 
>>>>>>>> WPA3 OWE Transition network (yesterday I lost a connection again).
>>>>>>>
>>>>>>> I have again lost 2 of 10 devices in the WPA3 OWE network (with 
>>>>>>> roaming). However, they no longer all disappear after a short 
>>>>>>> while; it seems to happen later.
>>>>>>>
>>>>>>> I also lost one device in a Router+Repeater WPA2 (CCMP) network. 
>>>>>>> The router side confirms that the device has been disconnected 
>>>>>>> for more than a day.
>>>>>>
>>>>>> We can't do anything without logs. If you suspect it's the 
>>>>>> blacklist, you can lower the blacklist timeout in main.conf:
>>>>>>
>>>>>> [Blacklist]
>>>>>> MaximumTimeout=5
>>>>>>
>>>>>> But logs would be best; we would be able to see if that is what's 
>>>>>> happening or if it's something else.
>>>>>>
>>>>>>
>>>>> I can also now see this on one of our local devices (which was 
>>>>> connected to Ethernet and WiFi).
>>>>>
>>>>> Please find attached the log. The logs start at late 23rd with the 
>>>>> image upgrade.
>>>>>
>>>>> I can see on our server side, that devices continuously connect 
>>>>> and disconnect again (missed heartbeat = hard loss of websocket 
>>>>> connection). I see some type of connection loop on client side too 
>>>>> ("org.eclipse.jetty.websocket.api.UpgradeException: 0 null" caused 
>>>>> by "java.util.concurrent.TimeoutException: DNS timeout 15000 ms"). 
>>>>> Maybe it alternates between Ethernet and WiFi, or the network is 
>>>>> being reconfigured all the time. You will understand this better.
>>>>>
>>>>> Here is my local status:
>>>>>
>>>>> tavla@tavla:~$ iwctl station wlan0 show
>>>>>                                  Station: wlan0
>>>>> --------------------------------------------------------------------------------
>>>>>   Settable  Property              Value
>>>>> --------------------------------------------------------------------------------
>>>>>             Scanning              no
>>>>>             State                 connecting
>>>>>             Connected network     XYZ
>>>>>             No IP addresses       Is DHCP client configured?
>>>>>
>>>>> tavla@tavla:~$ networkctl status
>>>>> ●        State: routable
>>>>>   Online state: unknown
>>>>>        Address: 192.168.178.178 on eth0
>>>>>                 2a0a:a549:da80:0:fadc:7aff:fe67:2e4 on eth0
>>>>>                 fe80::fadc:7aff:fe67:2e4 on eth0
>>>>>                 fe80::c2ee:40ff:fe8a:dd62 on wlan0
>>>>>        Gateway: 192.168.178.1 on eth0
>>>>>                 fe80::3a10:d5ff:fe37:2c79 on eth0
>>>>>            DNS: 192.168.178.1
>>>>>            NTP: 192.168.178.1
>>>>> IDX LINK  TYPE     OPERATIONAL SETUP
>>>>>   1 lo    loopback carrier     unmanaged
>>>>>   2 eth0  ether    routable    configured
>>>>>   3 wlan0 wlan     no-carrier  unmanaged
>>>>>
>>>>> 3 links listed.
>>>>
>>>> Yep, IWD is ultimately hung up waiting for a connect event from the 
>>>> kernel/driver. A few things I noticed prior to that:
>>>>
>>>> 1. Not really critical but you have an agent connecting and 
>>>> disconnecting multiple times a second. Are you polling info with 
>>>> iwctl or something?
>
> What do you mean by an agent? I have no iwctl agent (there is only a 
> WebSocket connection agent). Could this be something to check (also 
> on your side)?
Okt 23 23:00:48 tavla iwd[384]: src/agent.c:agent_register() agent register called
Okt 23 23:00:48 tavla iwd[384]: src/agent.c:agent_register() agent :1.179 path /agent/3420
Okt 23 23:00:48 tavla iwd[384]: src/agent.c:agent_disconnect() agent :1.179 disconnected
Okt 23 23:00:48 tavla iwd[384]: src/agent.c:agent_free() agent free 0xaaaae76dec10

Something is registering itself with IWD as an agent, then unregistering, 
over and over again every minute. It may not be iwctl, but something is 
doing that. It's not going to cause problems, but it's very odd.
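
To see how often this happens and under which D-Bus names, the agent lines could be tallied from the journal. A minimal sketch in Python, assuming the log entries are unwrapped (one entry per line) and match the format shown above; count_agent_churn is a hypothetical helper, not part of IWD:

```python
import re
from collections import Counter

# Patterns follow the agent lines in the iwd debug log quoted above.
# The "agent register called" line has no D-Bus name, so it is not matched.
REGISTER_RE = re.compile(r"agent_register\(\) agent (:\d+\.\d+) path (\S+)")
DISCONNECT_RE = re.compile(r"agent_disconnect\(\) agent (:\d+\.\d+) disconnected")


def count_agent_churn(lines):
    """Count agent registrations and disconnects per D-Bus unique name."""
    registered = Counter()
    disconnected = Counter()
    for line in lines:
        match = REGISTER_RE.search(line)
        if match:
            registered[match.group(1)] += 1
            continue
        match = DISCONNECT_RE.search(line)
        if match:
            disconnected[match.group(1)] += 1
    return registered, disconnected
```

Many distinct names that each register once and disconnect right away would suggest a short-lived client being started repeatedly rather than one long-running agent.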

>>>> [...]
>>> Okt 25 15:08:56 tavla iwd[384]: event: state, old: connecting (auto), new: disconnecting
>>> Okt 25 15:08:56 tavla iwd[384]: src/wiphy.c:wiphy_radio_work_done() Work item 19 done
>>> Okt 25 15:08:56 tavla iwd[384]: src/station.c:station_connect_cb() 3, result: 5
>>> Okt 25 15:08:56 tavla iwd[384]: src/station.c:station_disconnect_cb() 3, success: 0
>>> Okt 25 15:08:56 tavla iwd[384]: event: state, old: disconnecting, new: disconnected
>>> Okt 25 15:08:56 tavla iwd[384]: src/agent.c:agent_disconnect() agent :1.13740 disconnected
>>> Okt 25 15:08:56 tavla iwd[384]: src/agent.c:agent_free() agent free 0xaaaae76eaa50
>> Yes, the driver seems completely stuck. I would recommend 
>> unloading/reloading it with modprobe.
>
> I'm quite sure this is related to your BSS / AKM patch. It happened on 
> a client here after an image upgrade to this version. I have never 
> seen this before.
For WPA2 those patches _should_ have zero effect. Based on your logs, the 
behavior here is entirely different from the OWE case. There, the 
connection to the OWE network was actually failing (it failed within 
brcmfmac). Here, IWD tried to connect and got no event back from the 
driver. To know this for certain we would need to run iwmon and see the 
communication between IWD and the kernel.
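
Short of an iwmon capture, the IWD log alone can at least show how long the station sat in "connecting" before giving up. A rough sketch in Python, assuming unwrapped "event: state" lines like the ones in your log, that all entries fall within the same month, and an arbitrary 30-second threshold; connecting_spans is a hypothetical name. It does not replace iwmon, since it cannot show what the kernel actually sent back:

```python
import re
from datetime import timedelta

# Captures day-of-month, clock time, and the old/new state names.
# A qualifier such as "(auto)" after the state name is ignored.
STATE_RE = re.compile(
    r"^\S+\s+(\d+)\s+(\d\d:\d\d:\d\d)\s.*"
    r"event: state, old: ([\w-]+)(?: \(\w+\))?, new: ([\w-]+)"
)


def _offset(day, clock):
    """Convert day-of-month and HH:MM:SS into a timedelta for subtraction."""
    hours, minutes, seconds = (int(part) for part in clock.split(":"))
    return timedelta(days=int(day), hours=hours, minutes=minutes, seconds=seconds)


def connecting_spans(lines, threshold=timedelta(seconds=30)):
    """Return (start, duration, next_state) for long stays in 'connecting'."""
    spans = []
    entered = None  # (label, offset) of the last transition into connecting
    for line in lines:
        match = STATE_RE.match(line)
        if not match:
            continue
        day, clock, old, new = match.groups()
        now = _offset(day, clock)
        if new == "connecting":
            entered = (f"{day} {clock}", now)
        elif old == "connecting" and entered is not None:
            duration = now - entered[1]
            if duration >= threshold:
                spans.append((entered[0], duration, new))
            entered = None
    return spans
```

A span of several minutes that ends in "disconnecting" would be consistent with IWD waiting on an event that never arrived.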
>
> Also, I have debugging for BRCM enabled (CONFIG_BRCMDBG=y and 
> CONFIG_DEBUG_FS=y). It seems strange that there is no output, not 
> even beforehand.

Maybe Arend has some idea of why userspace could send a CMD_CONNECT and 
get no associated CMD_CONNECT event?

The fact that even manually connecting didn't work after this happened 
tells me the driver is "stuck". Did you try cycling the driver with 
modprobe?

>
> Thanks,
>
> Martin
>
