From: "Antoine Beaupré" <anarcat@debian.org>
To: Johannes Berg <johannes@sipsolutions.net>,
	linux-wireless@vger.kernel.org
Cc: Gregory Greenman <gregory.greenman@intel.com>, ilan.peer@intel.com
Subject: Re: Microcode SW error since Linux 6.5
Date: Sun, 24 Sep 2023 22:43:21 -0400
Message-ID: <87jzsf9dme.fsf@angela.anarc.at>
In-Reply-To: <60e2c052f3cedc5c80964e4be90c50cdaa899a87.camel@sipsolutions.net>

On 2023-09-21 21:29:27, Johannes Berg wrote:
> On Thu, 2023-09-21 at 13:24 -0400, Antoine Beaupré wrote:
>> Hi,
>> 
>> I've found what I feel might be a regression between Linux 6.1 and
>> 6.5. For other reasons, I upgraded the kernel on my Debian 12
>> ("bookworm", stable) laptop from the distribution 6.1.52 to the unstable
>> ("sid") version, 6.5.3.
>> 
>> After the upgrade, I started to notice stuttering in my audio player. I
>> tracked it down and managed to correlate it with some kernel errors
>> related to the iwlwifi driver.
>> 
>> What's interesting is that this happens regardless of whether or not the
>> NIC is connected to a network. In at least one of the traces, the
>> computer was connected over a wire and wireless was not associated in
>> Network Manager.
>
> This happens when scanning.

Ah, that makes sense!

>> Here's an example of the problem:
>> 
>> sep 21 09:33:14 angela kernel: iwlwifi 0000:a6:00.0: Microcode SW error detected. Restarting 0x0.
>
> Can you give a few wpa_supplicant lines (there were some below) above
> this? Just want to make sure it really is scanning on wlan0, not
> something with P2P device.

Interestingly, for the above fault, there's no wpa_supplicant line just
*before*. There's this *after*:

sep 21 09:33:14 angela wpa_supplicant[1563]: wlan0: CTRL-EVENT-SCAN-FAILED ret=-5
sep 21 09:33:15 angela kernel: iwlwifi 0000:a6:00.0: WFPM_UMAC_PD_NOTIFICATION: 0x1f
sep 21 09:33:15 angela kernel: iwlwifi 0000:a6:00.0: WFPM_LMAC2_PD_NOTIFICATION: 0x1f
sep 21 09:33:15 angela kernel: iwlwifi 0000:a6:00.0: WFPM_AUTH_KEY_0: 0x80
sep 21 09:33:15 angela kernel: iwlwifi 0000:a6:00.0: CNVI_SCU_SEQ_DATA_DW9: 0x0
sep 21 09:33:15 angela wpa_supplicant[1563]: wlan0: CTRL-EVENT-REGDOM-CHANGE init=DRIVER type=WORLD

But an earlier one is preceded by:

sep 21 09:32:45 angela wpa_supplicant[1563]: wlan0: CTRL-EVENT-SCAN-FAILED ret=-5
sep 21 09:32:45 angela kernel: iwlwifi 0000:a6:00.0: Microcode SW error detected. Restarting 0x0.
[...]

>> sep 21 09:33:14 angela kernel: iwlwifi 0000:a6:00.0: 0x20103600 | ADVANCED_SYSASSERT
>
>> sep 21 09:33:14 angela kernel: iwlwifi 0000:a6:00.0: 0x000000FF | umac data1
>
> This means that somehow scan_start_mac_or_link_id in the driver ended up
> 0xff which is invalid, but I'm not sure I see immediately how that
> happened, since it looks like in 6.5.3 we do assign it reasonably. I
> guess somehow in the code link_info->fw_link_id must be 0xff (invalid
> ID), but I'm not sure I see how that could happen.
>
> *thinks*
>
> Oh.. This is an older firmware, so it doesn't have
> IWL_UCODE_TLV_CAPA_MLD_API_SUPPORT! Hah. I feel like I had some concerns
> in this area before ... but maybe the other way around.
>
> I think something like this, perhaps:
>
> --- a/drivers/net/wireless/intel/iwlwifi/mvm/scan.c
> +++ b/drivers/net/wireless/intel/iwlwifi/mvm/scan.c
> @@ -2342,7 +2342,7 @@ iwl_mvm_scan_umac_fill_general_p_v12(struct iwl_mvm *mvm,
>  	if (gen_flags & IWL_UMAC_SCAN_GEN_FLAGS_V2_FRAGMENTED_LMAC2)
>  		gp->num_of_fragments[SCAN_HB_LMAC_IDX] = IWL_SCAN_NUM_OF_FRAGS;
>  
> -	if (version < 12) {
> +	if (version < 12 || !iwl_mvm_has_mld_api(mvm->fw)) {
>  		gp->scan_start_mac_or_link_id = scan_vif->id;
>  	} else {
>  		struct iwl_mvm_vif_link_info *link_info;
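
To make the effect of that one-line change concrete, here is a minimal,
self-contained C sketch of the decision it alters. This is not the driver
code: the sketch_* names, the two structs and SKETCH_INVALID_LINK_ID are
hypothetical stand-ins, and it assumes (following the guess quoted above)
that on firmware without the MLD API the link's firmware ID is left at
0xff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the invalid ID reported as "umac data1". */
#define SKETCH_INVALID_LINK_ID 0xff

/* Hypothetical model of the firmware: only the capability that matters here. */
struct sketch_fw {
	bool has_mld_api;	/* models IWL_UCODE_TLV_CAPA_MLD_API_SUPPORT */
};

/* Hypothetical model of a vif: its MAC ID and the firmware link ID. */
struct sketch_vif {
	uint8_t id;		/* models scan_vif->id */
	uint8_t fw_link_id;	/* models link_info->fw_link_id */
};

/* Before the change: only the scan command version is checked. */
static inline uint8_t sketch_scan_start_id_before(int version,
						  const struct sketch_fw *fw,
						  const struct sketch_vif *vif)
{
	assert(fw != NULL && vif != NULL);
	if (version < 12)
		return vif->id;
	return vif->fw_link_id;
}

/* After the change: firmware without the MLD API also gets the MAC ID. */
static inline uint8_t sketch_scan_start_id_patched(int version,
						   const struct sketch_fw *fw,
						   const struct sketch_vif *vif)
{
	assert(fw != NULL && vif != NULL);
	if (version < 12 || !fw->has_mld_api)
		return vif->id;
	return vif->fw_link_id;
}
```

With version 12 and a firmware model lacking the MLD API, the "before"
variant returns whatever fw_link_id holds (0xff under the assumption
above, matching the value in the trace), while the patched variant falls
back to the vif ID.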

Interesting! In any case, the firmware is certainly out of date in
Debian stable, and I guess it's to be expected that running it out of
sync with the kernel is a Bad Idea. It's just not something I'd thought
of before. :)

Thanks for the debugging, I'll make sure to keep the firmware and kernel
in better lockstep in the future!

a.

-- 
Lorsque l'on range des objets dans des tiroirs, et que l'on a plus
d'objets que de tiroirs, alors un tiroir au moins contient deux
objets.
                        - Lejeune-Dirichlet, Peter Gustav
