Linux wireless drivers development
* Re: [PATCH 01/61] Coccinelle: Prefer IS_ERR_OR_NULL over manual NULL check
From: Krzysztof Kozlowski @ 2026-04-16 12:30 UTC (permalink / raw)
  To: Philipp Hahn, amd-gfx, apparmor, bpf, ceph-devel, cocci, dm-devel,
	dri-devel, gfs2, intel-gfx, intel-wired-lan, iommu, kvm,
	linux-arm-kernel, linux-block, linux-bluetooth, linux-btrfs,
	linux-cifs, linux-clk, linux-erofs, linux-ext4, linux-fsdevel,
	linux-gpio, linux-hyperv, linux-input, linux-kernel, linux-leds,
	linux-media, linux-mips, linux-mm, linux-modules, linux-mtd,
	linux-nfs, linux-omap, linux-phy, linux-pm, linux-rockchip,
	linux-s390, linux-scsi, linux-sctp, linux-security-module,
	linux-sh, linux-sound, linux-stm32, linux-trace-kernel, linux-usb,
	linux-wireless, netdev, ntfs3, samba-technical, sched-ext,
	target-devel, tipc-discussion, v9fs
  Cc: Julia Lawall, Nicolas Palix
In-Reply-To: <20260310-b4-is_err_or_null-v1-1-bd63b656022d@avm.de>

On 10/03/2026 12:48, Philipp Hahn wrote:
> Find and convert uses of IS_ERR() plus NULL check to IS_ERR_OR_NULL().
> 
> There are several cases where `!ptr && WARN_ON[_ONCE](IS_ERR(ptr))` is
> used:
> - arch/x86/kernel/callthunks.c:215 WARN_ON_ONCE
> - drivers/clk/clk.c:4561 WARN_ON_ONCE
> - drivers/interconnect/core.c:793 WARN_ON
> - drivers/reset/core.c:718 WARN_ON
> The change is not 100% semantically equivalent, as the warning will now
> also happen when the pointer is NULL.
> 
> To: Julia Lawall <Julia.Lawall@inria.fr>
> To: Nicolas Palix <nicolas.palix@imag.fr>
> Cc: cocci@inria.fr
> Cc: linux-kernel@vger.kernel.org
> 
> ---
> drivers/clocksource/mips-gic-timer.c:283 looks suspicious: ret != clk,
> but Daniel Lezcano verified it as correct.
> 
> There are some cases where the checks are part of a larger expression:
> - mm/kmemleak.c:1095
> - mm/kmemleak.c:1155
> - mm/kmemleak.c:1173
> - mm/kmemleak.c:1290
> - mm/kmemleak.c:1328
> - mm/kmemleak.c:1241
> - mm/kmemleak.c:1310
> - mm/kmemleak.c:1258
> - net/netlink/af_netlink.c:2670
> Thanks to Julia Lawall for the help to also handle them.
> 
> Signed-off-by: Philipp Hahn <phahn-oss@avm.de>
> ---
>  scripts/coccinelle/api/is_err_or_null.cocci | 125 ++++++++++++++++++++++++++++
>  1 file changed, 125 insertions(+)
> 

Neither this, nor the attempt from 2011, nor any future attempt should be
accepted, because it creates the impression that IS_ERR_OR_NULL is somehow
okay. No, it is not okay: it is a discouraged pattern that leads to less
readable and less maintainable code. We should therefore not have any tools
suggesting the use of IS_ERR_OR_NULL, because people will convert poor code
into that instead of fixing the poor code.

Best regards,
Krzysztof


* Re: Wi-Fi speeds degrade from 600Mps to 30Mps while using WPA2 security, but not on open network, shortly after ISP firmware upgrade.
From: Pablo MARTIN-GOMEZ @ 2026-04-16 11:47 UTC (permalink / raw)
  To: Benson Bear; +Cc: linux-wireless
In-Reply-To: <CACM6vn6UXfSXw2WpYvzF+ODPGHw-LtsBMgtvc6n7s9iF9eaS6Q@mail.gmail.com>

On 16/04/2026 13:03, Benson Bear wrote:
> Hi Pablo, thanks for your really prompt reply.   And sorry
> for the spelling error in the Subject header ("Mps" for "Mbps")
> and the horrid line formatting.  (Not used to using short lines
> although that is what I grew up on).
> 
> And... you got it!   The MFP flag is lacking, and googling
> showed me that instead of  messing with wpa_supplicant, I
> can apparently do the same with nmcli:
> 
> nmcli connection modify NAME 802-11-wireless-security.pmf disable
Oh yeah, I didn't check the nmcli documentation thoroughly enough to find
that option. Much easier to implement than going the raw wpa_supplicant
way.
> 
> I tried that and it worked!   Internet speed test back to 600Mbps!
> What a relief!  Thank you very much!
> 
> I will try other testing with just pure Wi-Fi and with all machines
> after I get some sleep.
> 
> Can you tell me why this is likely to have happened?  Surely
> one side or the other is misconfigured?  This misunderstanding
> between them should not be possible within a good specification,
> right?
Given that your client does not have the MFP flag and that you can connect
without PMF, your AP advertises MFP Capable (and so does your client when
PMF is not disabled). After the association and the 4-way handshake, the AP
believes it has correctly negotiated MFP but your client does not. The AP
therefore sends the client encrypted action frames, which the client drops,
and the client sends unencrypted action frames, which the AP refuses. The
easiest way to debug this would be to capture the auth + assoc + 4-way
handshake + action frames over the air and provide the SSID + the PSK, so
that everything can be decrypted and we can understand who is in the wrong.
If the issue is on the client side, it is most probably in wpa_supplicant
and not in the kernel.
> 
> (Sadly I think I might have an idea -- it's partly my fault.   The
> mac80211 module was disabling all HT and above because it
> felt it could not meet the BCS criteria laid down by the AP.
> Many people have thought these criteria were way too onerous.
> So I first had very low speeds because of that.  No HT even.
> I applied a patch to the module that ignored these requirements
> and got back to a high connection speed, and HE or VHT
> enabled, and got back a little of the lost speed.   Clearly very
> kludgy but seems a legitimate response to what the patch
> author called "aggressive basic MCS rates". But it may
> have opened up the room for misunderstandings.  I have my
> speed back in practice, but I wonder what the "correct" way
> of fixing this would be -- without the kludgy module patch).
> 
> Thanks again!
> 



* Re: [PATCH 55/61] interconnect: Prefer IS_ERR_OR_NULL over manual NULL check
From: Krzysztof Kozlowski @ 2026-04-16 12:24 UTC (permalink / raw)
  To: Philipp Hahn, amd-gfx, apparmor, bpf, ceph-devel, cocci, dm-devel,
	dri-devel, gfs2, intel-gfx, intel-wired-lan, iommu, kvm,
	linux-arm-kernel, linux-block, linux-bluetooth, linux-btrfs,
	linux-cifs, linux-clk, linux-erofs, linux-ext4, linux-fsdevel,
	linux-gpio, linux-hyperv, linux-input, linux-kernel, linux-leds,
	linux-media, linux-mips, linux-mm, linux-modules, linux-mtd,
	linux-nfs, linux-omap, linux-phy, linux-pm, linux-rockchip,
	linux-s390, linux-scsi, linux-sctp, linux-security-module,
	linux-sh, linux-sound, linux-stm32, linux-trace-kernel, linux-usb,
	linux-wireless, netdev, ntfs3, samba-technical, sched-ext,
	target-devel, tipc-discussion, v9fs
  Cc: Georgi Djakov
In-Reply-To: <20260310-b4-is_err_or_null-v1-55-bd63b656022d@avm.de>

On 10/03/2026 12:49, Philipp Hahn wrote:
> Prefer using IS_ERR_OR_NULL() over using IS_ERR() and a manual NULL
> check.
> 
> Semantic change: Previously the code only printed the warning on error,
> but not when the pointer was NULL. Now the warning is printed in both
> cases!

NAK, read the code

> 
> Change found with coccinelle.
> 
> To: Georgi Djakov <djakov@kernel.org>
> Cc: linux-pm@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Signed-off-by: Philipp Hahn <phahn-oss@avm.de>
> ---
>  drivers/interconnect/core.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/interconnect/core.c b/drivers/interconnect/core.c
> index 8569b78a18517b33abeafac091978b25cbc1acc7..22e92b30f73853d5bd2e05b4f52cb5aa22556468 100644
> --- a/drivers/interconnect/core.c
> +++ b/drivers/interconnect/core.c
> @@ -790,7 +790,7 @@ void icc_put(struct icc_path *path)
>  	size_t i;
>  	int ret;
>  
> -	if (!path || WARN_ON(IS_ERR(path)))
> +	if (WARN_ON(IS_ERR_OR_NULL(path)))

IS_ERR_OR_NULL is simply discouraged, but code preference aside, you
just added a bug here. This is clearly not equivalent, and you now emit a
warning on a perfectly valid case!

Best regards,
Krzysztof


* Re: [PATCH wireless-next] wifi: mac80211: add __packed to union members of struct ieee80211_rx_status
From: Johannes Berg @ 2026-04-16 11:53 UTC (permalink / raw)
  To: Ping-Ke Shih, linux-wireless@vger.kernel.org
In-Reply-To: <280094f50a534fc998037b21c36ebe11@realtek.com>

On Tue, 2026-04-14 at 00:55 +0000, Ping-Ke Shih wrote:
> 
> > Because of size assertion of rtw88's efuse map [1], I found
> > arm-linux-gnueabi-gcc compiler throws this warning, but x86 gcc is absolutely
> > silent and expected without this patch.

Yeah, depends on ABI padding rules.

> > [1]
> > https://lore.kernel.org/linux-wireless/7c65e315-5a2e-455e-87ee-8fc6d60ed807@gmail.com/T/#m43fdf8a1c2b8cff92c1ef50faab7993522647162
> 
> I'd note that discussion thread [2] of original kernel test robot.
> Arnd pointed out the cause is CONFIG_AEABI is not set, and he said
> nobody should be using ARM OABI any more. 

Ah, right.

> Maybe, we can ignore the CPU and skip this patch.

Given that nobody complained in many years about not being able to use
wifi on those machines, I'd be inclined to just do that, yeah. But
thanks for digging into it!

johannes


* Re: [patch 05/38] treewide: Remove CLOCK_TICK_RATE
From: Geert Uytterhoeven @ 2026-04-16 11:22 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Arnd Bergmann, x86, Lu Baolu, iommu, Michael Grzeschik,
	netdev, linux-wireless, Herbert Xu, linux-crypto, Vlastimil Babka,
	linux-mm, David Woodhouse, Bernie Thompson, linux-fbdev,
	Theodore Tso, linux-ext4, Andrew Morton, Uladzislau Rezki,
	Marco Elver, Dmitry Vyukov, kasan-dev, Andrey Ryabinin,
	Thomas Sailer, linux-hams, Jason A. Donenfeld, Richard Henderson,
	linux-alpha, Russell King, linux-arm-kernel, Catalin Marinas,
	Huacai Chen, loongarch, linux-m68k, Dinh Nguyen, Jonas Bonn,
	linux-openrisc, Helge Deller, linux-parisc, Michael Ellerman,
	linuxppc-dev, Paul Walmsley, linux-riscv, Heiko Carstens,
	linux-s390, David S. Miller, sparclinux
In-Reply-To: <20260410120317.910770161@kernel.org>

On Fri, 10 Apr 2026 at 14:18, Thomas Gleixner <tglx@kernel.org> wrote:
> This has been scheduled for removal more than a decade ago and the comments
> related to it have been dutifully ignored. The last dependencies are gone.
>
> Remove it along with various now empty asm/timex.h files.
>
> Signed-off-by: Thomas Gleixner <tglx@kernel.org>

>  arch/m68k/include/asm/timex.h       |   15 ---------------

Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> # m68k

Gr{oetje,eeting}s,

                        Geert


--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds


* Re: [patch 27/38] m68k: Select ARCH_HAS_RANDOM_ENTROPY
From: Geert Uytterhoeven @ 2026-04-16 11:22 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, linux-m68k, Arnd Bergmann, x86, Lu Baolu, iommu,
	Michael Grzeschik, netdev, linux-wireless, Herbert Xu,
	linux-crypto, Vlastimil Babka, linux-mm, David Woodhouse,
	Bernie Thompson, linux-fbdev, Theodore Tso, linux-ext4,
	Andrew Morton, Uladzislau Rezki, Marco Elver, Dmitry Vyukov,
	kasan-dev, Andrey Ryabinin, Thomas Sailer, linux-hams,
	Jason A. Donenfeld, Richard Henderson, linux-alpha, Russell King,
	linux-arm-kernel, Catalin Marinas, Huacai Chen, loongarch,
	Dinh Nguyen, Jonas Bonn, linux-openrisc, Helge Deller,
	linux-parisc, Michael Ellerman, linuxppc-dev, Paul Walmsley,
	linux-riscv, Heiko Carstens, linux-s390, David S. Miller,
	sparclinux
In-Reply-To: <20260410120319.397219631@kernel.org>

On Fri, 10 Apr 2026 at 14:20, Thomas Gleixner <tglx@kernel.org> wrote:
> The only remaining usage of get_cycles() is to provide
> random_get_entropy().
>
> Switch m68k over to the new scheme of selecting ARCH_HAS_RANDOM_ENTROPY and
> providing random_get_entropy() in asm/random.h.
>
> Remove asm/timex.h as it has no functionality anymore.
>
> Signed-off-by: Thomas Gleixner <tglx@kernel.org>

Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>

Gr{oetje,eeting}s,

                        Geert


--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds


* Re: [patch 07/38] treewide: Consolidate cycles_t
From: Geert Uytterhoeven @ 2026-04-16 11:22 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Arnd Bergmann, x86, Lu Baolu, iommu, Michael Grzeschik,
	netdev, linux-wireless, Herbert Xu, linux-crypto, Vlastimil Babka,
	linux-mm, David Woodhouse, Bernie Thompson, linux-fbdev,
	Theodore Tso, linux-ext4, Andrew Morton, Uladzislau Rezki,
	Marco Elver, Dmitry Vyukov, kasan-dev, Andrey Ryabinin,
	Thomas Sailer, linux-hams, Jason A. Donenfeld, Richard Henderson,
	linux-alpha, Russell King, linux-arm-kernel, Catalin Marinas,
	Huacai Chen, loongarch, Dinh Nguyen, Jonas Bonn, linux-openrisc,
	Helge Deller, linux-parisc, Michael Ellerman, linuxppc-dev,
	Paul Walmsley, linux-riscv, Heiko Carstens, linux-s390,
	David S. Miller, sparclinux
In-Reply-To: <20260410120318.045532623@kernel.org>

On Fri, 10 Apr 2026 at 14:19, Thomas Gleixner <tglx@kernel.org> wrote:
> Most architectures define cycles_t as unsigned long except:
>
>  - x86 requires it to be 64-bit independent of the 32-bit/64-bit build.
>
>  - parisc and mips define it as unsigned int
>
>    parisc has no real reason to do so as there are only a few usage sites
>    which either expand it to a 64-bit value or utilize only the lower
>    32bits.
>
>    mips has no real requirement either.
>
> Move the typedef to types.h and provide a config switch to enforce the
> 64-bit type for x86.
>
> Signed-off-by: Thomas Gleixner <tglx@kernel.org>

>  arch/m68k/include/asm/timex.h      |    2 --

Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> # m68k

Gr{oetje,eeting}s,

                        Geert


--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds


* Re: Wi-Fi speeds degrade from 600Mps to 30Mps while using WPA2 security, but not on open network, shortly after ISP firmware upgrade.
From: Benson Bear @ 2026-04-16 11:28 UTC (permalink / raw)
  To: Johannes Berg; +Cc: Pablo MARTIN-GOMEZ, linux-wireless
In-Reply-To: <ee6a0a4e735b3e97fbe96a25d2af59ee0f663fc9.camel@sipsolutions.net>

On Thu, Apr 16, 2026 at 7:09 AM Johannes Berg <johannes@sipsolutions.net> wrote:

> I think what you're referring to here might be this?

> https://lore.kernel.org/linux-wireless/99Mv9QEceyPrQhSP52MtAVmz0_kWJmzqotJjD9YW6LGLqk-AZloAueUyHCURilFkuqOh6Ecv8i2KKdSE1ujP3AnbU5QEouVisT1w_V3xdfc=@r26.me/

Absolutely yes that is the issue, but the patch I found was different.

As far as I can tell that patch did not determine any specific
conditions under which the check could be bypassed, and this
one does.   And, I assume, it does so correctly?

 So the "correct" way forward that I was asking about has already
been found?  Excellent.

I will apply this patch myself and determine that it does the
job for me just as the other one did.

Here by the way is the one I used, from out in the wild:

https://github.com/WoodyWoodster/mac80211-mcs-patch

So I had two issues with the Rogers XB7 firmware.  They
seem now to be unrelated?   I think plausibly mac80211
could be said to have a bug in it, on the BCS issue, but
what about the issue I originally asked about?


* Re: Wi-Fi speeds degrade from 600Mps to 30Mps while using WPA2 security, but not on open network, shortly after ISP firmware upgrade.
From: Johannes Berg @ 2026-04-16 11:09 UTC (permalink / raw)
  To: Benson Bear, Pablo MARTIN-GOMEZ; +Cc: linux-wireless
In-Reply-To: <CACM6vn6UXfSXw2WpYvzF+ODPGHw-LtsBMgtvc6n7s9iF9eaS6Q@mail.gmail.com>

On Thu, 2026-04-16 at 07:03 -0400, Benson Bear wrote:
> 
> (Sadly I think I might have an idea -- it's partly my fault.   The
> mac80211 module was disabling all HT and above because it
> felt it could not meet the BCS criteria laid down by the AP.
> Many people have thought these criteria were way too onerous.
> So I first had very low speeds because of that.  No HT even.
> I applied a patch to the module that ignored these requirements
> and got back to a high connection speed, and HE or VHT
> enabled, and got back a little of the lost speed.   Clearly very
> kludgy but seems a legitimate response to what the patch
> author called "aggressive basic MCS rates". But it may
> have opened up the room for misunderstandings.  I have my
> speed back in practice, but I wonder what the "correct" way
> of fixing this would be -- without the kludgy module patch).

I think what you're referring to here might be this?

https://lore.kernel.org/linux-wireless/99Mv9QEceyPrQhSP52MtAVmz0_kWJmzqotJjD9YW6LGLqk-AZloAueUyHCURilFkuqOh6Ecv8i2KKdSE1ujP3AnbU5QEouVisT1w_V3xdfc=@r26.me/

johannes


* Re: Wi-Fi speeds degrade from 600Mps to 30Mps while using WPA2 security, but not on open network, shortly after ISP firmware upgrade.
From: Benson Bear @ 2026-04-16 11:03 UTC (permalink / raw)
  To: Pablo MARTIN-GOMEZ; +Cc: linux-wireless
In-Reply-To: <b1a7678d-8e87-444e-b38a-bb7aedcd4f30@eskapa.be>

Hi Pablo, thanks for your really prompt reply.   And sorry
for the spelling error in the Subject header ("Mps" for "Mbps")
and the horrid line formatting.  (Not used to using short lines
although that is what I grew up on).

And... you got it!   The MFP flag is lacking, and googling
showed me that instead of  messing with wpa_supplicant, I
can apparently do the same with nmcli:

nmcli connection modify NAME 802-11-wireless-security.pmf disable

I tried that and it worked!   Internet speed test back to 600Mbps!
What a relief!  Thank you very much!

I will try other testing with just pure Wi-Fi and with all machines
after I get some sleep.

Can you tell me why this is likely to have happened?  Surely
one side or the other is misconfigured?  This misunderstanding
between them should not be possible within a good specification,
right?

(Sadly I think I might have an idea -- it's partly my fault.   The
mac80211 module was disabling all HT and above because it
felt it could not meet the BCS criteria laid down by the AP.
Many people have thought these criteria were way too onerous.
So I first had very low speeds because of that.  No HT even.
I applied a patch to the module that ignored these requirements
and got back to a high connection speed, and HE or VHT
enabled, and got back a little of the lost speed.   Clearly very
kludgy but seems a legitimate response to what the patch
author called "aggressive basic MCS rates". But it may
have opened up the room for misunderstandings.  I have my
speed back in practice, but I wonder what the "correct" way
of fixing this would be -- without the kludgy module patch).

Thanks again!


* Re: [patch 18/38] lib/tests: Replace get_cycles() with ktime_get()
From: Geert Uytterhoeven @ 2026-04-16 10:24 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Andrew Morton, Uladzislau Rezki, linux-mm, Arnd Bergmann,
	x86, Lu Baolu, iommu, Michael Grzeschik, netdev, linux-wireless,
	Herbert Xu, linux-crypto, Vlastimil Babka, David Woodhouse,
	Bernie Thompson, linux-fbdev, Theodore Tso, linux-ext4,
	Marco Elver, Dmitry Vyukov, kasan-dev, Andrey Ryabinin,
	Thomas Sailer, linux-hams, Jason A. Donenfeld, Richard Henderson,
	linux-alpha, Russell King, linux-arm-kernel, Catalin Marinas,
	Huacai Chen, loongarch, linux-m68k, Dinh Nguyen, Jonas Bonn,
	linux-openrisc, Helge Deller, linux-parisc, Michael Ellerman,
	linuxppc-dev, Paul Walmsley, linux-riscv, Heiko Carstens,
	linux-s390, David S. Miller, sparclinux
In-Reply-To: <20260410120318.794680738@kernel.org>

Hi Thomas,

On Fri, 10 Apr 2026 at 14:20, Thomas Gleixner <tglx@kernel.org> wrote:
> get_cycles() is the historical access to a fine grained time source, but it
> is a suboptimal choice for two reasons:
>
>    - get_cycles() is not guaranteed to be supported and functional on all
>      systems/platforms. If not supported or not functional it returns 0,
>      which makes benchmarking moot.
>
>    - get_cycles() returns the raw counter value of whatever the
>      architecture platform provides. The original x86 Time Stamp Counter
>      (TSC) was despite its name tied to the actual CPU core frequency.
>      That's no longer the case. So the counter value is only meaningful
>      when the CPU operates at the same frequency as the TSC or the value is
>      adjusted to the actual CPU frequency. Other architectures and
>      platforms provide similar disjunct counters via get_cycles(), so the
>      result is operations per BOGO-cycles, which is not really meaningful.
>
> Use ktime_get() instead which provides nanosecond timestamps with the
> granularity of the underlying hardware counter, which is not different to
> the variety of get_cycles() implementations.
>
> This provides at least understandable metrics, i.e. operations/nanoseconds,
> and is available on all platforms. As with get_cycles() the result might
> have to be put into relation with the CPU operating frequency, but that's
> not any different.
>
> This is part of a larger effort to remove get_cycles() usage from
> non-architecture code.
>
> Signed-off-by: Thomas Gleixner <tglx@kernel.org>

Thanks for your patch!

> --- a/lib/interval_tree_test.c
> +++ b/lib/interval_tree_test.c
> @@ -65,13 +65,13 @@ static void init(void)
>  static int basic_check(void)
>  {
>         int i, j;
> -       cycles_t time1, time2, time;
> +       ktime_t time1, time2, time;
>
>         printk(KERN_ALERT "interval tree insert/remove");
>
>         init();
>
> -       time1 = get_cycles();
> +       time1 = ktime_get();
>
>         for (i = 0; i < perf_loops; i++) {
>                 for (j = 0; j < nnodes; j++)
> @@ -80,11 +80,11 @@ static int basic_check(void)
>                         interval_tree_remove(nodes + j, &root);
>         }
>
> -       time2 = get_cycles();
> +       time2 = ktime_get();
>         time = time2 - time1;
>
>         time = div_u64(time, perf_loops);
> -       printk(" -> %llu cycles\n", (unsigned long long)time);
> +       printk(" -> %llu nsecs\n", (unsigned long long)time);

While cycles_t was unsigned long or long long, ktime_t is always s64,
so "%lld", and the cast can be dropped (everywhere).

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds


* Re: [PATCH v3 3/8] wifi: ath10k: snoc: support powering on the device via pwrseq
From: Luca Weiss @ 2026-04-16 10:06 UTC (permalink / raw)
  To: Dmitry Baryshkov, Liam Girdwood, Mark Brown, Rob Herring,
	Krzysztof Kozlowski, Conor Dooley, Bartosz Golaszewski,
	Marcel Holtmann, Luiz Augusto von Dentz, Jeff Johnson,
	Bjorn Andersson, Konrad Dybcio, Manivannan Sadhasivam, Vinod Koul,
	Balakrishna Godavarthi, Matthias Kaehlcke
  Cc: linux-arm-msm, linux-kernel, devicetree, linux-bluetooth,
	linux-wireless, ath10k, linux-pm, Krzysztof Kozlowski,
	Bartosz Golaszewski
In-Reply-To: <20260119-wcn3990-pwrctl-v3-3-948df19f5ec2@oss.qualcomm.com>

Hi Dmitry,

On Mon Jan 19, 2026 at 6:07 PM CET, Dmitry Baryshkov wrote:
> The WCN39xx family of WiFi/BT chips incorporates a simple PMU, spreading
> voltages over internal rails. Implement support for using powersequencer
> for this family of ATH10k devices in addition to using regulators.
>
> Reviewed-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
> Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
> ---
>  drivers/net/wireless/ath/ath10k/snoc.c | 53 ++++++++++++++++++++++++++++++++--
>  drivers/net/wireless/ath/ath10k/snoc.h |  3 ++
>  2 files changed, 53 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/net/wireless/ath/ath10k/snoc.c b/drivers/net/wireless/ath/ath10k/snoc.c
> index b3f6424c17d3..f72f236fb9eb 100644
> --- a/drivers/net/wireless/ath/ath10k/snoc.c
> +++ b/drivers/net/wireless/ath/ath10k/snoc.c
> @@ -1,6 +1,7 @@
>  // SPDX-License-Identifier: ISC
>  /*
>   * Copyright (c) 2018 The Linux Foundation. All rights reserved.
> + * Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
>   */
>  
>  #include <linux/bits.h>
> @@ -11,6 +12,7 @@
>  #include <linux/of_device.h>
>  #include <linux/platform_device.h>
>  #include <linux/property.h>
> +#include <linux/pwrseq/consumer.h>
>  #include <linux/regulator/consumer.h>
>  #include <linux/remoteproc/qcom_rproc.h>
>  #include <linux/of_reserved_mem.h>
> @@ -1023,10 +1025,14 @@ static int ath10k_hw_power_on(struct ath10k *ar)
>  
>  	ath10k_dbg(ar, ATH10K_DBG_SNOC, "soc power on\n");
>  
> -	ret = regulator_bulk_enable(ar_snoc->num_vregs, ar_snoc->vregs);
> +	ret = pwrseq_power_on(ar_snoc->pwrseq);
>  	if (ret)
>  		return ret;
>  
> +	ret = regulator_bulk_enable(ar_snoc->num_vregs, ar_snoc->vregs);
> +	if (ret)
> +		goto pwrseq_off;
> +
>  	ret = clk_bulk_prepare_enable(ar_snoc->num_clks, ar_snoc->clks);
>  	if (ret)
>  		goto vreg_off;
> @@ -1035,18 +1041,28 @@ static int ath10k_hw_power_on(struct ath10k *ar)
>  
>  vreg_off:
>  	regulator_bulk_disable(ar_snoc->num_vregs, ar_snoc->vregs);
> +pwrseq_off:
> +	pwrseq_power_off(ar_snoc->pwrseq);
> +
>  	return ret;
>  }
>  
>  static int ath10k_hw_power_off(struct ath10k *ar)
>  {
>  	struct ath10k_snoc *ar_snoc = ath10k_snoc_priv(ar);
> +	int ret_seq = 0;
> +	int ret_vreg;
>  
>  	ath10k_dbg(ar, ATH10K_DBG_SNOC, "soc power off\n");
>  
>  	clk_bulk_disable_unprepare(ar_snoc->num_clks, ar_snoc->clks);
>  
> -	return regulator_bulk_disable(ar_snoc->num_vregs, ar_snoc->vregs);
> +	ret_vreg = regulator_bulk_disable(ar_snoc->num_vregs, ar_snoc->vregs);
> +
> +	if (ar_snoc->pwrseq)
> +		ret_seq = pwrseq_power_off(ar_snoc->pwrseq);
> +
> +	return ret_vreg ? : ret_seq;
>  }
>  
>  static void ath10k_snoc_wlan_disable(struct ath10k *ar)
> @@ -1762,7 +1778,38 @@ static int ath10k_snoc_probe(struct platform_device *pdev)
>  		goto err_release_resource;
>  	}
>  
> -	ar_snoc->num_vregs = ARRAY_SIZE(ath10k_regulators);
> +	/*
> +	 * devm_pwrseq_get() can return -EPROBE_DEFER in two cases:
> +	 * - it is not supposed to be used
> +	 * - it is supposed to be used, but the driver hasn't probed yet.
> +	 *
> +	 * There is no simple way to distinguish between these two cases, but:
> +	 * - if it is not supposed to be used, then regulator_bulk_get() will
> +	 *   return all regulators as expected, continuing the probe
> +	 * - if it is supposed to be used, but wasn't probed yet, we will get
> +	 *   -EPROBE_DEFER from regulator_bulk_get() too.
> +	 *
> +	 * For backwards compatibility with DTs specifying regulators directly
> +	 * rather than using the PMU device, ignore the defer error from
> +	 * pwrseq.
> +	 */
> +	ar_snoc->pwrseq = devm_pwrseq_get(&pdev->dev, "wlan");
> +	if (IS_ERR(ar_snoc->pwrseq)) {
> +		ret = PTR_ERR(ar_snoc->pwrseq);
> +		ar_snoc->pwrseq = NULL;
> +		if (ret != -EPROBE_DEFER)
> +			goto err_free_irq;

I'm fairly sure this is now broken with CONFIG_POWER_SEQUENCING=n since
then pwrseq_get() is returning ERR_PTR(-ENOSYS) which is not handled
here.

I'm observing my ath10k_snoc is now failing to probe "with error -38"
which definitely seems to be related, but I haven't debugged it further
yet.

Regards
Luca


* Re: Wi-Fi speeds degrade from 600Mps to 30Mps while using WPA2 security, but not on open network, shortly after ISP firmware upgrade.
From: Pablo MARTIN-GOMEZ @ 2026-04-16  9:39 UTC (permalink / raw)
  To: Benson Bear, linux-wireless
In-Reply-To: <CACM6vn7QGKQcR5Rs=wmzNA-SgMDZX4Hw=UiPQHfYkWgLURcbAA@mail.gmail.com>

Hello,
On 16/04/2026 10:47, Benson Bear wrote:
> Hi folks, I've never posted here before, don't know much about wireless, but
> am having a big problem I have been trying to solve for a week. I've been
> googling and ai chatting non-stop but finally after reading the info page
> about the list figured it would probably be acceptable to send this message.
> 
> BRIEFEST SUMMARY: There was a firmware update in Rogers's (Canada) XB7 Gateway,
> and subsequently my Wi-Fi transfer speeds degraded badly on all three
> Linux notebooks I have. Fully up to date notebooks, running Fedora 43 and
> 42 with most recent kernel 6.19.11. Two different NICs: RTL8852BE and Intel
> 7265 (rev 59). Wired machines and phones are unaffected.
> 
> The machines all connect with high transfer rates of around 800-1000Mbps
> on the 5 GHz band, with an 80 MHz wide channel, and MCS level ranging from
> 7 to 11 (HE and VHT).
> 
> Transfer speeds using WPA2 security have dropped in the one case (RTL)
> from 600Mbps to about 30Mbps. (Using internet speed test but iperf3 gives
> similar). The other cases are similar.
When traffic is capped around 30-50Mbps, the usual suspect is
aggregation not being set up.
> 
> BUT the transfer using no network security is still what it used to be!
> It is simply the enabling of WPA2 that brings them to their knees.
If I had to guess, this is an issue with PMF. Either the STA or the AP
considers PMF to be active while the other does not, so the action frames
that set up a BA session are dropped.

On your notebooks, check whether `MFP` appears in the flags listed in
`/sys/kernel/debug/ieee80211/phy<n>/netdev\:<devname>/stations/<bssid>/flags`.
> 
> So it seems to be a problem related to WPA2, and at a lower level in the
> stack of modules, since it happens on two different NICs?
You can try a few things:
- build wpa_supplicant from source (master branch) and replace Fedora's
binaries
- use a raw wpa_supplicant connection and set ieee80211w=0 in the config
file
- switch the NetworkManager backend to iwd
- upgrade the security to WPA3
> 
> I suspected for a long time that it was a firmware bug in the gateway, but
> now I am starting to wonder. I have no solid evidence of that except that
> Windows works fine on the same gateway and the same machine.
> 
> All three machines work well on another network I have occasional access
> to, and have worked fine on this network until about a week ago.
> 
> I have ordered another router that I hope I can use to solve the
> immediate practical problem, but I would really like to figure out
> what is going on and contribute what I can to fixing it, even if only
> by being sent out to gather potentially useful data.
> 
> Thank you.
> 

Pablo MG


* [PATCH] wifi: brcmfmac: Fix potential use-after-free issue when stopping watchdog task
From: Marek Szyprowski @ 2026-04-16  9:33 UTC (permalink / raw)
  To: linux-wireless, brcm80211, brcm80211-dev-list.pdl
  Cc: Marek Szyprowski, Arend van Spriel
In-Reply-To: <CGME20260416093428eucas1p2fde898f84c1e15dd94d1ecb52707c72b@eucas1p2.samsung.com>

The watchdog task might end between the send_sig() and kthread_stop() calls,
which results in a use-after-free. Fix this by increasing the watchdog task
reference count before calling send_sig() and dropping it by switching to
kthread_stop_put().

Fixes: 373c83a801f1 ("brcmfmac: stop watchdog before detach and free everything")
Fixes: a9ffda88be74 ("brcm80211: fmac: abstract bus_stop interface function pointer")
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
---
This fixes the following, rarely observed issue when no firmware binary
is available:

brcmfmac mmc2:0001:1: Direct firmware load for brcm/brcmfmac4330-sdio.txt failed with error -2
------------[ cut here ]------------
------------[ cut here ]------------
WARNING: kernel/fork.c:781 at __put_task_struct+0x13c/0x140, CPU#0: kworker/0:1/10
Modules linked in: brcmfmac hci_uart btbcm btintel bluetooth sha256 cfg80211 s5p_csis s5p_fimc s5p_mfc exynos4_is_common v4l2_fwnode v4l2_async ecdh_generic ecc s5p_jpeg videobuf2_dma_contig v4l2_mem2mem videobuf2_memops videobuf2_v4l2 videobuf2_common videodev brcmutil mc
CPU: 0 UID: 0 PID: 10 Comm: kworker/0:1 Not tainted 7.0.0-rc6-next-20260402 #12549 PREEMPT 
Hardware name: Samsung Exynos (Flattened Device Tree)
Workqueue: events request_firmware_work_func
Call trace: 
 unwind_backtrace from show_stack+0x10/0x14
 show_stack from dump_stack_lvl+0x68/0x88
 dump_stack_lvl from __warn+0x94/0x204
 __warn from warn_slowpath_fmt+0x1b0/0x1bc
 warn_slowpath_fmt from __put_task_struct+0x13c/0x140
 __put_task_struct from rcu_core+0x330/0x1220
 rcu_core from handle_softirqs+0x130/0x5b0
 handle_softirqs from __irq_exit_rcu+0x144/0x1f0
 __irq_exit_rcu from irq_exit+0x8/0x28
 irq_exit from call_with_stack+0x18/0x20
 call_with_stack from __irq_svc+0x9c/0xd0
Exception stack(0xe0891c20 to 0xe0891c68)
...
 __irq_svc from console_flush_one_record+0x394/0x570
 console_flush_one_record from console_unlock+0x78/0x148
 console_unlock from vprintk_emit+0x224/0x390
 vprintk_emit from vprintk_default+0x20/0x28
 vprintk_default from _printk+0x2c/0x5c
 _printk from warn_slowpath_fmt+0xe8/0x1bc
 warn_slowpath_fmt from kthread_stop+0x2bc/0x364
 kthread_stop from brcmf_sdio_remove+0x2c/0x194 [brcmfmac]
 brcmf_sdio_remove [brcmfmac] from brcmf_sdiod_remove+0x20/0xb8 [brcmfmac]
 brcmf_sdiod_remove [brcmfmac] from brcmf_ops_sdio_remove+0x34/0x5c [brcmfmac]
 brcmf_ops_sdio_remove [brcmfmac] from sdio_bus_remove+0x30/0x10c
 sdio_bus_remove from device_release_driver_internal+0x190/0x204
 device_release_driver_internal from brcmf_sdio_firmware_callback+0x50/0x944 [brcmfmac]
 brcmf_sdio_firmware_callback [brcmfmac] from brcmf_fw_request_done+0x154/0x17c [brcmfmac]
 brcmf_fw_request_done [brcmfmac] from request_firmware_work_func+0x50/0x98
 request_firmware_work_func from process_one_work+0x260/0x7dc
 process_one_work from worker_thread+0x1ac/0x3b0
 worker_thread from kthread+0x128/0x168
 kthread from ret_from_fork+0x14/0x28
Exception stack(0xe0891fb0 to 0xe0891ff8)
...
irq event stamp: 18820
hardirqs last  enabled at (18826): [<c01c7d8c>] vprintk_emit+0x364/0x390
hardirqs last disabled at (18831): [<c01c7d48>] vprintk_emit+0x320/0x390
softirqs last  enabled at (18598): [<c0add65c>] __alloc_skb+0x168/0x1c0
softirqs last disabled at (18629): [<c013e038>] __irq_exit_rcu+0x144/0x1f0
---[ end trace 0000000000000000 ]---
WARNING: lib/refcount.c:25 at kthread_stop+0x2bc/0x364, CPU#0: kworker/0:1/10
refcount_t: addition on 0; use-after-free.
Modules linked in: brcmfmac hci_uart btbcm btintel bluetooth sha256 cfg80211 s5p_csis s5p_fimc s5p_mfc exynos4_is_common v4l2_fwnode v4l2_async ecdh_generic ecc s5p_jpeg videobuf2_dma_contig v4l2_mem2mem videobuf2_memops videobuf2_v4l2 videobuf2_common videodev brcmutil mc
CPU: 0 UID: 0 PID: 10 Comm: kworker/0:1 Tainted: G        W           7.0.0-rc6-next-20260402 #12549 PREEMPT 
Tainted: [W]=WARN
Hardware name: Samsung Exynos (Flattened Device Tree)
Workqueue: events request_firmware_work_func
Call trace: 
 unwind_backtrace from show_stack+0x10/0x14
 show_stack from dump_stack_lvl+0x68/0x88
 dump_stack_lvl from __warn+0x94/0x204
 __warn from warn_slowpath_fmt+0x124/0x1bc
 warn_slowpath_fmt from kthread_stop+0x2bc/0x364
 kthread_stop from brcmf_sdio_remove+0x2c/0x194 [brcmfmac]
 brcmf_sdio_remove [brcmfmac] from brcmf_sdiod_remove+0x20/0xb8 [brcmfmac]
 brcmf_sdiod_remove [brcmfmac] from brcmf_ops_sdio_remove+0x34/0x5c [brcmfmac]
 brcmf_ops_sdio_remove [brcmfmac] from sdio_bus_remove+0x30/0x10c
 sdio_bus_remove from device_release_driver_internal+0x190/0x204
 device_release_driver_internal from brcmf_sdio_firmware_callback+0x50/0x944 [brcmfmac]
 brcmf_sdio_firmware_callback [brcmfmac] from brcmf_fw_request_done+0x154/0x17c [brcmfmac]
 brcmf_fw_request_done [brcmfmac] from request_firmware_work_func+0x50/0x98
 request_firmware_work_func from process_one_work+0x260/0x7dc
 process_one_work from worker_thread+0x1ac/0x3b0
 worker_thread from kthread+0x128/0x168
 kthread from ret_from_fork+0x14/0x28
Exception stack(0xe0891fb0 to 0xe0891ff8)
...
irq event stamp: 19327
hardirqs last  enabled at (19435): [<c01c3810>] __up_console_sem+0x50/0x60
hardirqs last disabled at (19460): [<c01c37fc>] __up_console_sem+0x3c/0x60
softirqs last  enabled at (19458): [<c013dc0c>] handle_softirqs+0x330/0x5b0
softirqs last disabled at (19443): [<c013e038>] __irq_exit_rcu+0x144/0x1f0
---[ end trace 0000000000000000 ]---

Best regards
Marek Szyprowski, PhD
Samsung R&D Institute Poland
---
 drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c
index d34db69c25a7..e6de88a6a852 100644
--- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c
+++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c
@@ -2477,8 +2477,9 @@ static void brcmf_sdio_bus_stop(struct device *dev)
 	brcmf_dbg(TRACE, "Enter\n");
 
 	if (bus->watchdog_tsk) {
+		get_task_struct(bus->watchdog_tsk);
 		send_sig(SIGTERM, bus->watchdog_tsk, 1);
-		kthread_stop(bus->watchdog_tsk);
+		kthread_stop_put(bus->watchdog_tsk);
 		bus->watchdog_tsk = NULL;
 	}
 
@@ -4568,8 +4569,9 @@ void brcmf_sdio_remove(struct brcmf_sdio *bus)
 	if (bus) {
 		/* Stop watchdog task */
 		if (bus->watchdog_tsk) {
+			get_task_struct(bus->watchdog_tsk);
 			send_sig(SIGTERM, bus->watchdog_tsk, 1);
-			kthread_stop(bus->watchdog_tsk);
+			kthread_stop_put(bus->watchdog_tsk);
 			bus->watchdog_tsk = NULL;
 		}
 
-- 
2.34.1
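
For readers following the reference counting, here is a minimal user-space sketch of the race described in the commit message. It is not driver code: every `toy_*` name is hypothetical, and the single counter only models the idea that the stop path touches the task after the task may already have dropped its last reference (the trace above shows the "addition on 0" warning under kthread_stop()).

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a task: only the reference count matters here. */
struct toy_task {
	int refs;	/* outstanding references */
	bool freed;	/* set once the last reference is dropped */
};

static void toy_get(struct toy_task *t)
{
	assert(!t->freed);	/* taking a reference on a freed task is the bug */
	t->refs++;
}

static void toy_put(struct toy_task *t)
{
	assert(t->refs > 0);
	if (--t->refs == 0)
		t->freed = true;
}

/* The task exits on its own after the signal and drops its reference. */
static void toy_task_exits(struct toy_task *t)
{
	toy_put(t);
}

/*
 * Stop sequence. With hold_ref set, the caller pins the task before
 * signalling it. Returns true if the task was still valid at the point
 * where the stop call would touch it.
 */
static bool toy_stop(struct toy_task *t, bool hold_ref)
{
	bool valid;

	if (hold_ref)
		toy_get(t);
	toy_task_exits(t);	/* worst case: exit right after the signal */
	valid = !t->freed;	/* the stop call would use the task here */
	if (hold_ref)
		toy_put(t);	/* models the "put" half of a stop-and-put */
	return valid;
}
```

Calling `toy_stop()` on a task with one reference and `hold_ref` false reports the task as already freed; with `hold_ref` true it stays valid until the final put.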


^ permalink raw reply related

* RE: [PATCH wireless v2] wifi: iwlwifi: mld: stop TX during firmware restart
From: Sheroz Juraev @ 2026-04-16  9:02 UTC (permalink / raw)
  To: Miri Korenblit; +Cc: Johannes Berg, linux-wireless, stable

Hi Miri,

Thanks for the quick review. Let me address your points inline:

> Why is there a leak if we are freeing the SKBs after we failed?

You're right, "leak" is not the precise term — the skbs are freed
after iwl_trans_tx() returns -EIO. The issue is allocation churn:
mac80211 keeps scheduling TX via wake_tx_queue, so
iwl_mld_tx_from_txq() keeps dequeuing new frames, passing them to
the dead firmware, getting -EIO, and freeing them — in a tight loop
for the entire duration of the firmware restart. The /proc/allocinfo
numbers I cited (10.8 GiB / 16.5M allocations) reflect cumulative
allocations during that window, not a persistent leak.

The practical impact is CPU waste (softirq spinning on
alloc-send-fail-free) and slab fragmentation from millions of rapid
kmalloc/kfree cycles, which can cause memory pressure on systems
with limited RAM. Adding the in_hw_restart guard eliminates this
churn entirely — same as the existing guard in the RX path.
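
To make the churn concrete, here is a small user-space model of the dequeue loop with and without an early-return guard. It is only a sketch: the `toy_*` names are hypothetical, and charging one allocation per dequeued frame is a simplifying assumption, not a measurement of the real driver.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_mld {
	bool in_hw_restart;	/* models the restart flag discussed above */
	int queued;		/* frames waiting in the toy upper-layer queue */
	long allocs;		/* buffers allocated for transmit attempts */
	long frees;		/* buffers freed after a failed transmit */
};

/* Toy transport: every transmit fails while the firmware is down. */
static int toy_trans_tx(const struct toy_mld *m)
{
	return m->in_hw_restart ? -5 /* -EIO */ : 0;
}

/* Drain the queue; with guard set, bail out while a restart is pending. */
static void toy_tx_from_txq(struct toy_mld *m, bool guard)
{
	if (guard && m->in_hw_restart)
		return;		/* frames stay queued until the restart ends */

	while (m->queued > 0) {
		m->queued--;
		m->allocs++;	/* assumed: one allocation per attempt */
		if (toy_trans_tx(m) < 0)
			m->frees++;	/* failed frame is freed immediately */
	}
}
```

Without the guard, every queued frame is allocated and freed once while the firmware is down; with it, the queue is left untouched and nothing is allocated.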

> This was fixed by
> https://patchwork.kernel.org/project/linux-wireless/patch/
> 20260405054145.1064152-3-cole@unwrap.rs/

Thank you for pointing this out. Cole's patch fixes the TSO
segmentation explosion when AMSDU is disabled (max_tid_amsdu_len == 1
causing num_subframes == 0 → 32000 tiny segments). That's a
different code path from what I observed — my issue was the TX
dequeue loop running against dead firmware during restart, which
happens regardless of TSO/AMSDU state.

That said, the TSO segmentation explosion he fixed may explain
why the system freeze was so severe with TSO enabled — both bugs
could have been compounding. The in_hw_restart guard in my patch
would prevent both scenarios by stopping TX entirely before we
ever reach the TSO segmentation code.

> Not sure I understand if you have a new FW or not?

The ucode version string is the same: 101.6e695a70.0
(bz-b0-fm-c0-c101.ucode). But the linux-firmware package snapshot
changed — I was on an older nixpkgs snapshot when on kernel 6.19.5
(early March), and now I'm on linux-firmware-20260309. Since the
version string embedded in the ucode file is the same, the firmware
binary itself likely did not change. The NMI_INTERRUPT_UNKNOWN
crashes stopping may just be coincidental (different uptime,
different traffic patterns, or some other system-level change).

I don't have the old linux-firmware snapshot to do a binary diff,
so I can't say with certainty whether the firmware binary changed.
If you have a way to check internally whether there were firmware
fixes for Bz-series between, say, February and March 2026 releases,
that would clarify things.

Either way, the code path in iwl_mld_tx_from_txq() remains
unguarded — any firmware crash under TX load will hit the same
alloc churn. The RX path and TXQ allocation worker both check
in_hw_restart; the TX dequeue path should too.

Thanks,
Sheroz

^ permalink raw reply

* Wi-Fi speeds degrade from 600Mbps to 30Mbps while using WPA2 security, but not on open network, shortly after ISP firmware upgrade.
From: Benson Bear @ 2026-04-16  8:47 UTC (permalink / raw)
  To: linux-wireless

Hi folks, I've never posted here before, don't know much about wireless, but
am having a big problem I have been trying to solve for a week. I've been
googling and AI chatting non-stop, but finally, after reading the info page
about the list, figured it would probably be acceptable to send this message.

BRIEFEST SUMMARY: There was a firmware update in Rogers's (Canada) XB7 Gateway,
and subsequently my Wi-Fi transfer speeds degraded badly on all three
Linux notebooks I have. Fully up to date notebooks, running Fedora 43 and
42 with most recent kernel 6.19.11. Two different NICs: RTL8852BE and Intel
7265 (rev 59). Wired machines and phones are unaffected.

The machines all connect with high transfer rates of around 800-1000 Mbps on
the 5 GHz band, with an 80 MHz wide channel and MCS levels ranging from 7 to
11 (HE and VHT).

Transfer speeds using WPA2 security have dropped in the one case (RTL) from
600 Mbps to about 30 Mbps (measured with an internet speed test, but iperf3
gives similar results). The other cases are similar.

BUT the transfer speed with no network security is still what it used to be!
It is simply the enabling of WPA2 that brings them to their knees.

So it seems to be a problem related to WPA2, and at a lower level in the
stack of modules, since it happens on two different NICs?

I suspected for a long time that it was a firmware bug in the gateway, but
now I am starting to wonder. I have no solid evidence of that except that
Windows works fine on the same gateway and the same machine.

All three machines work well on another network I have occasional access
to, and have worked fine on this network until about a week ago.

I have ordered another router that I hope I can use to solve the
immediate practical problem, but I would really like to figure out
what is going on and contribute what I can to fixing it, even if only
by being sent out to gather potentially useful data.

Thank you.

^ permalink raw reply

* RE: [PATCH wireless v2] wifi: iwlwifi: mld: stop TX during firmware restart
From: Korenblit, Miriam Rachel @ 2026-04-16  8:46 UTC (permalink / raw)
  To: Sheroz Juraev
  Cc: Johannes Berg, linux-wireless@vger.kernel.org,
	stable@vger.kernel.org
In-Reply-To: <CADPJysx0mCpzh7b=kJC_OsZGvME9inx7EYo0imYwniCFO02FLg@mail.gmail.com>



> -----Original Message-----
> From: Sheroz Juraev <goodmartiandev@gmail.com>
> Sent: Thursday, April 16, 2026 11:37 AM
> To: Korenblit, Miriam Rachel <miriam.rachel.korenblit@intel.com>
> Cc: Johannes Berg <johannes@sipsolutions.net>; linux-wireless@vger.kernel.org;
> stable@vger.kernel.org
> Subject: RE: [PATCH wireless v2] wifi: iwlwifi: mld: stop TX during firmware
> restart
> 
> Hi Miri,
> 
> Thanks for looking into this. Unfortunately I don't have the raw dmesg logs from
> the original crash events — I didn't save them at the time and the journal has
> since rotated past those boots. I do have the system configuration details and the
> memory profiling data that led to the patch. Here's everything I can provide:
> 
> == Hardware / Firmware ==
> 
>   Machine:    ASUS Zenbook 14 UX3405CA
>   CPU:        Intel Core Ultra 9 285H (Arrow Lake), 16 cores
>   WiFi:       Intel(R) Wi-Fi 7 BE201 320MHz
>   PCI:        0000:00:14.3 [8086:7740] / subsystem [8086:00e4]
>   Interface:  wlo1 (renamed from wlan0)
>   Firmware:   101.6e695a70.0 bz-b0-fm-c0-c101.ucode, op_mode iwlmld
>   Kernel:     6.19.5 (when crashes were occurring)
>   OS:         NixOS (rolling release)
>   modprobe:   options iwlwifi power_save=0
>               options iwlmvm power_scheme=1
> 
> == Observed behavior (kernel 6.19.5) ==
> 
> Under sustained Tailscale (WireGuard) UDP traffic + active SSH sessions over
> WiFi, the firmware crashed with NMI_INTERRUPT_UNKNOWN approximately
> every 10–15 minutes. Each crash triggered ieee80211_restart_hw().
> 
> Two symptoms were observed after each firmware restart:
> 
> 1) Massive skb memory leak. Memory profiling (/proc/allocinfo)
>    showed the following after a single firmware crash cycle:
> 
>      10.8 GiB  16546157  net/core/skbuff.c:586  func:kmalloc_reserve
>       3.94 GiB  16546144  net/core/skbuff.c:679  func:__alloc_skb
Why is there a leak if we are freeing the SKBs after we failed?
> 
>    ~7 GB of skb buffers leaked per crash. The TX path kept
>    dequeuing frames from mac80211 and pushing them to the dead
>    firmware (iwl_trans_tx() returning -EIO), allocating and
>    immediately freeing skbs in a tight loop.
> 
> 2) System freeze when TSO was enabled. With TSO/GSO active on
>    wlo1, the crash path through iwl_mld_tx_from_txq →
>    iwl_mld_tx_skb → iwl_tx_tso_segment → skb_segment →
>    skb_release_head_state caused an RCU stall → complete system
>    freeze. Disabling TSO/GSO via ethtool prevented the deadlock
>    but not the skb leak.
This was fixed by https://patchwork.kernel.org/project/linux-wireless/patch/20260405054145.1064152-3-cole@unwrap.rs/
> 
> == Workarounds applied ==
> 
>   - ethtool -K wlo1 tso off gso off  (prevents system freeze)
>   - systemd watchdog service monitoring journalctl for
>     "iwlwifi.*restart completed", then rmmod/modprobe cycle
>     to reclaim leaked skb memory
>   - net.core.wmem_max / rmem_max capped at 2MB (limits per-crash
>     memory consumption)
> 
> == Current status (kernel 6.19.11, linux-firmware 20260309) ==
> 
> On the current firmware (linux-firmware-20260309, same ucode version string
> 101.6e695a70.0), the NMI_INTERRUPT_UNKNOWN crashes have stopped
> entirely. I ran heavy SSH + Tailscale traffic for
> 10+ minutes with TSO re-enabled and no firmware crash occurred.
> 
> I checked the kernel changelogs: there are zero iwlwifi changes between 6.19.6
> and 6.19.11, so the stability improvement is most likely from the firmware
> package update (the linux-firmware snapshot changed between my 6.19.5
> system and the current one).
Not sure I understand if you have a new FW or not?
> 
> == Why the patch is still needed ==
> 
> Even if the specific NMI_INTERRUPT_UNKNOWN trigger has been fixed in newer
> firmware, the code path is still unguarded:
> iwl_mld_tx_from_txq() does not check mld->fw_status.in_hw_restart before
> dequeuing. Any future firmware crash under load would hit the same skb churn /
> memory leak. The RX path and TXQ allocation worker already have this guard —
> the TX dequeue path is the only one missing it.
> 
> Let me know if there's anything else I can provide, or if you'd like me to try
> reproducing on an older firmware version.
> 
> Thanks,
> Sheroz

^ permalink raw reply

* RE: [PATCH wireless v2] wifi: iwlwifi: mld: stop TX during firmware restart
From: Sheroz Juraev @ 2026-04-16  8:37 UTC (permalink / raw)
  To: Miri Korenblit; +Cc: Johannes Berg, linux-wireless, stable

Hi Miri,

Thanks for looking into this. Unfortunately I don't have the raw dmesg
logs from the original crash events — I didn't save them at the time
and the journal has since rotated past those boots. I do have the
system configuration details and the memory profiling data that led
to the patch. Here's everything I can provide:

== Hardware / Firmware ==

  Machine:    ASUS Zenbook 14 UX3405CA
  CPU:        Intel Core Ultra 9 285H (Arrow Lake), 16 cores
  WiFi:       Intel(R) Wi-Fi 7 BE201 320MHz
  PCI:        0000:00:14.3 [8086:7740] / subsystem [8086:00e4]
  Interface:  wlo1 (renamed from wlan0)
  Firmware:   101.6e695a70.0 bz-b0-fm-c0-c101.ucode, op_mode iwlmld
  Kernel:     6.19.5 (when crashes were occurring)
  OS:         NixOS (rolling release)
  modprobe:   options iwlwifi power_save=0
              options iwlmvm power_scheme=1

== Observed behavior (kernel 6.19.5) ==

Under sustained Tailscale (WireGuard) UDP traffic + active SSH
sessions over WiFi, the firmware crashed with NMI_INTERRUPT_UNKNOWN
approximately every 10–15 minutes. Each crash triggered
ieee80211_restart_hw().

Two symptoms were observed after each firmware restart:

1) Massive skb memory leak. Memory profiling (/proc/allocinfo)
   showed the following after a single firmware crash cycle:

     10.8 GiB  16546157  net/core/skbuff.c:586  func:kmalloc_reserve
      3.94 GiB  16546144  net/core/skbuff.c:679  func:__alloc_skb

   ~7 GB of skb buffers leaked per crash. The TX path kept
   dequeuing frames from mac80211 and pushing them to the dead
   firmware (iwl_trans_tx() returning -EIO), allocating and
   immediately freeing skbs in a tight loop.

2) System freeze when TSO was enabled. With TSO/GSO active on
   wlo1, the crash path through iwl_mld_tx_from_txq →
   iwl_mld_tx_skb → iwl_tx_tso_segment → skb_segment →
   skb_release_head_state caused an RCU stall → complete system
   freeze. Disabling TSO/GSO via ethtool prevented the deadlock
   but not the skb leak.

== Workarounds applied ==

  - ethtool -K wlo1 tso off gso off  (prevents system freeze)
  - systemd watchdog service monitoring journalctl for
    "iwlwifi.*restart completed", then rmmod/modprobe cycle
    to reclaim leaked skb memory
  - net.core.wmem_max / rmem_max capped at 2MB (limits per-crash
    memory consumption)

== Current status (kernel 6.19.11, linux-firmware 20260309) ==

On the current firmware (linux-firmware-20260309, same ucode
version string 101.6e695a70.0), the NMI_INTERRUPT_UNKNOWN crashes
have stopped entirely. I ran heavy SSH + Tailscale traffic for
10+ minutes with TSO re-enabled and no firmware crash occurred.

I checked the kernel changelogs: there are zero iwlwifi changes
between 6.19.6 and 6.19.11, so the stability improvement is most
likely from the firmware package update (the linux-firmware
snapshot changed between my 6.19.5 system and the current one).

== Why the patch is still needed ==

Even if the specific NMI_INTERRUPT_UNKNOWN trigger has been fixed
in newer firmware, the code path is still unguarded:
iwl_mld_tx_from_txq() does not check mld->fw_status.in_hw_restart
before dequeuing. Any future firmware crash under load would hit
the same skb churn / memory leak. The RX path and TXQ allocation
worker already have this guard — the TX dequeue path is the only
one missing it.

Let me know if there's anything else I can provide, or if you'd
like me to try reproducing on an older firmware version.

Thanks,
Sheroz

^ permalink raw reply

* Re: [PATCH wireless-next v8 2/3] wifi: cfg80211: add initial UHR support
From: Manish Dharanenthiran @ 2026-04-16  7:50 UTC (permalink / raw)
  To: Johannes Berg, Harshitha Prem, linux-wireless
  Cc: vasanthakumar.thiagarajan, Lorenzo Bianconi, ath12k, Jeff Johnson,
	Ping-Ke Shih, Jouni Malinen, Benjamin Berg
In-Reply-To: <1cf0ae795b0e3e95b38cb7abf84ffad34c187fdf.camel@sipsolutions.net>



On 3/13/2026 1:02 AM, Johannes Berg wrote:
> 
>>> Because of this, an event-driven approach was considered.
> 
> So - starting this again from scratch. Benjamin and I spent some time
> discussing this today too, and hashed out a (mostly?) workable solution
> that should address most of the issues. I'll try to summarise that
> below.
> 
Thanks for the detailed summary.

> As will become obvious - and that's why I quoted only the line _you_
> wrote before - this means we (including myself :)) need to stop being
> afraid of hostapd doing (soft?) real-time [1] tasks...
> 
> [1] I'm using that word in the (formal) sense of having a deadline, not
> of having to be particularly fast.
> 
> 
> Let's assume the following constraints:
> 
> - preparing a beacon template as a real-time task can be done by
> hostapd, given enough heads-up time
> - no periodic events in a steady state when the AP is operating
> normally
> - TSF drift between links is correctly handled (maintaining <=30us
> offset at any time)
> 

Handling the beacon template within the TBTT interval between links is
possible in a model implementation, but on real-time low-cost platforms
where the AP is handling the maximum number of clients (256) with multiple
vdevs enabled, it might be too much overhead for user space to process.

> We evidently already make these assumptions:
> 
> - if beacon intervals are not the same, the TBTT offset in RNR is
> filled in by firmware (I see no way around this)
> - either firmware fills in TSF offset, or it's just zero, and not
> really accounting for slight drifts (but that's probably OK since it
> never adds up given the <=30us requirement)
> 
> And also let's introduce some new operations to driver/firmware:
> 
> - the firmware can drop a frame that it's not able to transmit before
> a given (as frame metadata) TSF value on the link, and indicate to
> the driver that this is the reason the frame was dropped
> - the firmware can create events at/after beacon TBTT (or beacon
> transmission), this can be controlled by the driver; these events
> contain the next TBTT's timestamp value
> - the TSF offsets between links can be known to the driver, if they can
> change (I suspect CSA could do that?) this can somehow be noticed by
> or given to the driver
> 
> With that, it seems we can redesign this whole thing to be event-driven
> and (mostly?) race-free.
> 
> In steady state, basically nothing would change from what hostapd is
> doing today. It simply configures beacon templates, occasionally updates
> them if elements need to change, and sends probe responses,
> (re)association responses etc. as usual.
> 
> During any sort of update (CSA, color change, EHT updates, UHR updates)
> things operate a bit differently:
> 
> 1) hostapd enables TBTT / beacon transmit events, these events would be
> generated by firmware and passed up, for each link, containing also the
> TBTT timestamp of the _next_ beacon to be transmitted
> 
> 2) hostapd waits for the TBTT event for the link that it wants to do the
> update on, ignoring events for other links
> 
> 3) starting from that TBTT event, on each TBTT event hostapd generates a
> new beacon template for the link the event was for, and configures it to
> the driver/firmware. Since that's a future beacon, it has to predict the
> content of that beacon using
> - the TBTT of the first beacon carrying the update announcement
> - the TSF offsets between the links
> - the beacon intervals of all the links
> (a bit more on this later)
> 
> 4) After applying the updates (a bit more on this later) and noticing
> that the announcements are finished, hostapd waits for one more TBTT
> event for each link and configures the beacons back to steady state,
> after which it turns off the events.
> 
> If, at any time during this, hostapd needs to send a probe response,
> (re)association response, EPP Capa/Operation response (or others?) which
> holds information about the updates with the current counter values,
> hostapd will create the frame per the current counters that it
> maintains, and will transmit this frame with a TSF cut-off value
> indicating that it must be transmitted before the next TBTT (over all
> links), or dropped.
> If this frame ends up being dropped by firmware because it didn't get
> out before the indicated TSF, hostapd gets a specific notification for
> this and then simply re-generates it and sends it again. This could
> possibly repeat if TBTTs are close together on multiple links, but I
> think it's not worth optimising for this case, though it could be done
> by deferring the response slightly based on timers, or at the expense of
> a more complex API ("defer until X and don't send after Y" vs. "don't
> send after Y"), neither seems really worthwhile.
> 

In dense client situations, where the AP is in a stadium or any other
crowded place and clients are moving between APs, we expect more traffic
drops for connected clients, since frames get dropped due to the above
condition. (Also, there is a chance that a station might add the AP to its
blacklist if there are more rejections while associating.)
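
The drop-and-regenerate loop from the quoted proposal can be sketched as follows. This is a toy model under stated assumptions: a single link, a fixed delay before the frame reaches the air, and hypothetical `toy_*` names throughout. None of this is an existing nl80211 or driver interface.

```c
#include <assert.h>
#include <stdint.h>

enum toy_tx_status { TOY_TX_OK, TOY_TX_DROPPED_CUTOFF };

/* Firmware side: transmit only if the frame gets out before the cut-off. */
static enum toy_tx_status toy_fw_tx(uint64_t now_tsf, uint64_t delay,
				    uint64_t cutoff_tsf)
{
	return (now_tsf + delay < cutoff_tsf) ? TOY_TX_OK
					      : TOY_TX_DROPPED_CUTOFF;
}

/* Next TBTT strictly after now_tsf for one link (TBTTs at multiples). */
static uint64_t toy_next_tbtt(uint64_t now_tsf, uint64_t interval)
{
	return (now_tsf / interval + 1) * interval;
}

/*
 * User-space side: build the response for the current counters and send
 * it with the next TBTT as cut-off. On a cut-off drop, rebuild after that
 * TBTT and retry. Returns the number of attempts used, or 0 on giving up.
 */
static int toy_send_response(uint64_t now_tsf, uint64_t delay,
			     uint64_t interval, int max_attempts)
{
	for (int attempt = 1; attempt <= max_attempts; attempt++) {
		uint64_t cutoff = toy_next_tbtt(now_tsf, interval);

		if (toy_fw_tx(now_tsf, delay, cutoff) == TOY_TX_OK)
			return attempt;
		now_tsf = cutoff;	/* counters changed at this TBTT */
	}
	return 0;
}
```

In this model a response queued shortly before a TBTT costs one extra round trip, and a delay longer than the beacon interval never succeeds, which is the kind of behaviour the concern about loaded platforms is pointing at.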

> 
> I said I'd give more information for (3) and (4) above, so:
> 
> For (3), also consider that it already has to effectively be able to do
> this for the templates thing we discussed, it has to predict what each
> link is going to look like in the future. I think this isn't too much of
> an issue, but care must be taken especially if beacon intervals differ.
> 
> For (4), I think the way how the updates are done may depend on what the
> update is. If, for example, it's DBE increasing the bandwidth, then
> could just do the update _before_ the 0 beacon is transmitted, and if
> it's decreasing bandwidth could do it _after_ the 0 beacon is
> transmitted. Some of these may potentially require management by the
> kernel or even driver/firmware (how do you switch NPCA parameters at the
> exact right point if not in FW?), and perhaps (especially for CSA?)

In the offloaded case, firmware takes care of removing the newly added
element(s) once the beacon with count 1 is sent. For CSA and ML reconfig,
firmware will send a completion event, from which the kernel/driver data
will be updated.

> there will be some considerations regarding multiple interfaces.
> I mostly think this question is orthogonal, since armed with a TBTT
> hostapd could also request that this update be done at a given TBTT.
> 
> 
As mentioned above, this gets tricky in cases that involve low-cost
platforms in a dense client scenario. Users might not like having a
glitch in their traffic (especially during live streaming) :)

> We haven't really been able to poke significant holes into this, but
> maybe that doesn't mean much. Couple of thoughts on that:
> 
>   * For each link, hostapd has roughly the whole beacon interval to build
>     the next beacon's template, which seems reasonable.
>   * There's a really weird corner case where an assoc response is
>     attempted to transmit just before a beacon, doesn't get an ACK, but a
>     retransmission isn't possible until after the beacon and it's dropped
>     due to the TSF cut-off. Doesn't seem worth worrying about.
>   * If the TBTTs for two links are at the same time, and the events to
>     userspace for them are not coming "updated link first", then the
>     beacon transmitted at the same time on the unchanged link may not yet
>     be announcing the update, depending on the event order, given that
>     hostapd waited for the affected link's first TBTT event. This doesn't
>     really seem like a problem, but I think could be addressed by
>     updating all the links on the first event immediately or so, or
>     (Benjamin prefers this I think) adding the first beacon's TBTT to the
>     response to the event enable command, I just worry that would cause
>     other races that would need to be addressed.
> 
> That's it for now :) Let me know what you think.
> 
> johannes

Thanks for the detailed answer with a clear explanation. Yes, this is much
less racy than the previous designs and addresses most of the issues.
We do think that this is suitable for cases where the CPU load is very
low and time sensitivity with respect to client handling is not a problem.

But in real-time use cases where the AP is deployed in crowded places, the
CPU load spikes, which might cause more traffic drops or connectivity
issues. Also, if the AP is enabled with a cloud analyzer that collects
periodic stats, the load will be high, so there is no guarantee that the
user-space application gets CPU cycles in time. That's why we preferred an
offload solution where most of the time-sensitive operations are within
the firmware.

We will come up with an offloaded design that would be scalable for the
upstream driver(s), and also address the hwsim cases, where we would want
minimal test cases for handling CU to facilitate upstream STA
implementations.

Regards
Manish Dharanenthiran


^ permalink raw reply

* Re: [PATCH v2 1/3] wifi: wcn36xx: fix heap overflow from oversized firmware HAL response
From: Johannes Berg @ 2026-04-16  6:38 UTC (permalink / raw)
  To: Tristan Madani, Loic Poulain; +Cc: wcn36xx, linux-wireless
In-Reply-To: <20260415223710.1616925-2-tristmd@gmail.com>

Hi Tristan,

On Wed, 2026-04-15 at 22:37 +0000, Tristan Madani wrote:
> From: Tristan Madani <tristan@talencesecurity.com>
> 
> The firmware response dispatcher copies all synchronous HAL responses
> into the 4096-byte hal_buf without validating the response length. A
> response exceeding WCN36XX_HAL_BUF_SIZE causes a heap buffer overflow
> with firmware-controlled content.
> 
> Add a bounds check on the response length.

No real problem with these patches etc., but it seems implausible that
you're not using some kind of tool/LLM assistance, which you're supposed
to disclose (or at least I guess I'm supposed to ask you to):

https://docs.kernel.org/process/coding-assistants.html

johannes

^ permalink raw reply

* Re: [PATCH v2 2/2] wifi: b43: fix OOB read from hardware key index in b43_rx()
From: Jonas Gorski @ 2026-04-16  6:34 UTC (permalink / raw)
  To: Tristan Madani; +Cc: Johannes Berg, linux-wireless, b43-dev, linux-kernel
In-Reply-To: <20260415222425.1544638-3-tristmd@gmail.com>

Hi,

On Thu, Apr 16, 2026 at 12:24 AM Tristan Madani <tristmd@gmail.com> wrote:
>
> From: Tristan Madani <tristan@talencesecurity.com>
>
> The firmware-controlled key index in b43_rx() can exceed the dev->key[]
> array size (58 entries). The existing B43_WARN_ON is non-enforcing in
> production builds, allowing an out-of-bounds read of 1 byte from struct
> b43_firmware. A non-zero OOB value causes RX_FLAG_DECRYPTED to be
> incorrectly set on un-decrypted frames.
>
> Replace with an enforcing check that skips the key lookup for invalid
> indices.
>
> Fixes: e4d6b7951812 ("[B43]: add mac80211-based driver for modern BCM43xx devices")
> Signed-off-by: Tristan Madani <tristan@talencesecurity.com>
> ---
> drivers/net/wireless/broadcom/b43/xmit.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/wireless/broadcom/b43/xmit.c b/drivers/net/wireless/broadcom/b43/xmit.c
> index XXXXXXX..XXXXXXX 100644
> --- a/drivers/net/wireless/broadcom/b43/xmit.c
> +++ b/drivers/net/wireless/broadcom/b43/xmit.c
> @@ -704,7 +704,10 @@ void b43_rx(struct b43_wldev *dev, struct sk_buff *skb, const void *_rxhdr)
>                  */
>                 keyidx = b43_kidx_to_raw(dev, keyidx);
> -               B43_WARN_ON(keyidx >= ARRAY_SIZE(dev->key));
> +               if (keyidx >= ARRAY_SIZE(dev->key)) {
> +                       b43dbg(dev->wl, "RX: invalid key index %u\n", keyidx);
> +                       goto drop;
> +               }

B43_WARN_ON() returns the condition's result, so if you keep it you
can shorten this to

if (B43_WARN_ON(keyidx >= ARRAY_SIZE(dev->key)))
        goto drop;
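
As a self-contained illustration of the idiom, here is a user-space sketch of a warn macro that returns its condition, used to guard an array index. `TOY_WARN_ON` and the other `toy_*` names are hypothetical stand-ins, not the b43 definitions; only the array size of 58 is taken from the patch description.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

#define TOY_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

static int toy_warn_count;

/* Log when the condition holds and hand the condition back to the caller. */
static bool toy_warn_on(bool cond, const char *expr)
{
	if (cond) {
		toy_warn_count++;
		fprintf(stderr, "WARNING: %s\n", expr);
	}
	return cond;
}
#define TOY_WARN_ON(x) toy_warn_on(!!(x), #x)

struct toy_key { bool in_use; };
struct toy_wldev { struct toy_key key[58]; };

/* Returns 1 if the key is in use, 0 if not, -1 for an invalid index. */
static int toy_key_lookup(const struct toy_wldev *dev, unsigned int keyidx)
{
	if (TOY_WARN_ON(keyidx >= TOY_ARRAY_SIZE(dev->key)))
		return -1;	/* warn and bail out in a single statement */
	return dev->key[keyidx].in_use ? 1 : 0;
}
```

The warning and the enforcement share one expression, so the check cannot drift apart from the message that reports it.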

Best regards,
Jonas

^ permalink raw reply

* Re: [PATCH v2 1/2] wifi: b43: fix infinite loop from invalid hardware DMA RX slot
From: Jonas Gorski @ 2026-04-16  6:34 UTC (permalink / raw)
  To: Tristan Madani; +Cc: Johannes Berg, linux-wireless, b43-dev, linux-kernel
In-Reply-To: <20260415222425.1544638-2-tristmd@gmail.com>

Hi,

On Thu, Apr 16, 2026 at 12:24 AM Tristan Madani <tristmd@gmail.com> wrote:
>
> From: Tristan Madani <tristan@talencesecurity.com>
>
> b43_dma_rx() reads current_slot from hardware via get_current_rxslot().
> If the value is >= ring->nr_slots, the B43_WARN_ON only warns but
> continues. The for loop then never terminates because next_slot() wraps
> modulo nr_slots and can never reach the out-of-range current_slot.
>
> Replace the B43_WARN_ON with an explicit bounds check that returns
> early when the hardware reports an invalid slot index.
>
> Fixes: e4d6b7951812 ("[B43]: add mac80211-based driver for modern BCM43xx devices")
> Signed-off-by: Tristan Madani <tristan@talencesecurity.com>
> ---
> drivers/net/wireless/broadcom/b43/dma.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/wireless/broadcom/b43/dma.c b/drivers/net/wireless/broadcom/b43/dma.c
> index XXXXXXX..XXXXXXX 100644
> --- a/drivers/net/wireless/broadcom/b43/dma.c
> +++ b/drivers/net/wireless/broadcom/b43/dma.c
> @@ -1693,7 +1693,10 @@ void b43_dma_rx(struct b43_dmaring *ring)
>         B43_WARN_ON(ring->tx);
>         current_slot = ops->get_current_rxslot(ring);
> -       B43_WARN_ON(!(current_slot >= 0 && current_slot < ring->nr_slots));
> +       if (!(current_slot >= 0 && current_slot < ring->nr_slots)) {
> +               B43_WARN_ON(1);
> +               return;
> +       }

B43_WARN_ON() returns the condition's result, so you can shorten this to

if (B43_WARN_ON(!(current_slot >= 0 && current_slot < ring->nr_slots)))
        return;
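
To make the failure mode from the commit message concrete, here is a small userspace sketch, assuming a ring walk that wraps at nr_slots. The names (struct sketch_ring, sketch_next_slot(), sketch_dma_rx(), sketch_walk_reaches()) are hypothetical and the real ring handling is reduced to a counter. It shows that a wrapping walk can only ever visit slots 0..nr_slots-1, so an out-of-range target is never reached unless it is rejected up front.

```c
#include <stdbool.h>

/* Hypothetical, reduced ring state. */
struct sketch_ring {
	int nr_slots;
	int current_slot;	/* first slot not yet processed */
};

/* Advance one slot, wrapping back to 0 at the end of the ring. */
static int sketch_next_slot(const struct sketch_ring *ring, int slot)
{
	if (slot == ring->nr_slots - 1)
		return 0;
	return slot + 1;
}

/*
 * Process slots up to (not including) hw_slot. Returns the number of
 * slots handled, or -1 if hw_slot is rejected by the bounds check.
 */
static int sketch_dma_rx(struct sketch_ring *ring, int hw_slot)
{
	int slot, handled = 0;

	if (!(hw_slot >= 0 && hw_slot < ring->nr_slots))
		return -1;	/* without this, the loop below never ends */

	for (slot = ring->current_slot; slot != hw_slot;
	     slot = sketch_next_slot(ring, slot))
		handled++;

	ring->current_slot = slot;
	return handled;
}

/* One full lap visits every valid slot; report whether target was seen. */
static bool sketch_walk_reaches(const struct sketch_ring *ring, int start,
				int target)
{
	int slot = start;

	for (int i = 0; i < ring->nr_slots; i++) {
		if (slot == target)
			return true;
		slot = sketch_next_slot(ring, slot);
	}
	return false;
}
```

sketch_walk_reaches() is only there to demonstrate the claim: for a target of nr_slots or above it returns false after a full lap, which is exactly the case where the unguarded loop would spin.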

Best regards,
Jonas


* Re: [PATCH v2] wifi: rtw89: phy: increase RF calibration timeouts for USB transport
From: Louis Kotze @ 2026-04-16  4:56 UTC (permalink / raw)
  To: pkshih; +Cc: linux-wireless, linux-kernel, rtl8821cerfe2, lucid_duck,
	Louis Kotze
In-Reply-To: <1a90ff00d83c47b995cf75165b2a304b@realtek.com>

> Can we remove this phrase? No need to mention v1 in commit message.

Done -- reworded to stand alone without referencing prior versions.

> I'm not sure this should be called a "bug", as Bitterblue did not
> adjust these timeouts in the earlier version.

Fair point -- the timeouts were correct for PCIe; USB was not in
scope yet. Changed to "condition" instead of "bug".

> I'm also not sure if this is correct. The calibration time of DACK
> might rely on WiFi hardware and external components, not only I/O
> speed.

You're right, I overclaimed. Reworded to note that transport
round-trip latency appears to dominate under these test conditions,
but hardware and external component factors may also contribute.

All three changes applied in v3. Also added Tested-by tags from
Devin Wittmayer across RTL8922AU/8852AU/8852BU/8852CU (Framework 13
and Raspberry Pi 5), and Reported-by with a link to his xHCI hard
lockup evidence.

Thank you for the review.



* [PATCH v3] wifi: rtw89: phy: increase RF calibration timeouts for USB transport
From: Louis Kotze @ 2026-04-16  4:55 UTC (permalink / raw)
  To: pkshih; +Cc: linux-wireless, linux-kernel, rtl8821cerfe2, lucid_duck,
	Louis Kotze

USB transport adds significant latency to H2C/C2H round-trips used
by RF calibration. The existing timeout values were designed for PCIe
and are too tight for USB, causing "failed to wait RF DACK",
"failed to wait RF TSSI" and similar errors on USB adapters.

Apply a 4x timeout multiplier when the device uses USB transport.
The multiplier is applied in rtw89_phy_rfk_report_wait() so all
calibrations benefit without changing any call sites or PCIe
timeout values.

The 4x multiplier was chosen based on measured data from two
independent testers (RTL8922AU, 6GHz MLO and 2.4/5GHz):

  Calibration   PCIe timeout   Max measured (USB)   4x timeout
  PRE_NTFY           5ms              1ms              20ms
  DACK              58ms             72ms             232ms
  RX_DCK           128ms            374ms             512ms
  TSSI normal       20ms             24ms              80ms
  TSSI scan          6ms             14ms              24ms
  TXGAPK            54ms             18ms             216ms
  IQK               84ms             53ms             336ms
  DPK               34ms             30ms             136ms

Tested with RTL8922AU on 6GHz MLO (5GHz + 6GHz simultaneous):
25 connect/disconnect cycles with zero failures.

The 4x multiplier was also verified under adverse host conditions
on 5GHz. 5 cycles per scenario, stress-ng as the load generator,
max observed time per calibration:

  Calibration  PCIe  4x   Baseline  CPU stress  Mem stress  Combined
  PRE_NTFY       5   20     0         0           0           1
  DACK          58  232    71 (!)    71 (!)      71 (!)      71 (!)
  RX_DCK       128  512    23        22          22          23
  IQK           84  336    53        53          53          53
  DPK           34  136    23        23          26          23
  TSSI          20   80     6         9          14           9
  TXGAPK        54  216    16        16          16          16

Legend: (!) = exceeds PCIe budget but within 4x budget.

Two observations from that matrix:

1. DACK exceeds the stock PCIe budget (58ms) in baseline on 5GHz
   on this hardware. Without the 4x multiplier, DACK fails with
   -ETIMEDOUT deterministically on every connect, no stress
   needed. This is the condition the patch addresses.

2. Calibration times appear dominated by USB transport round-trip
   latency rather than host load, though hardware and external
   component factors may also contribute. DACK stays at 71ms
   across all four scenarios. Host-side stress has essentially
   zero effect on observed calibration duration. Bumping the
   multiplier above 4x would not address a failure mode that
   this stress matrix produces.

Reported-by: Devin Wittmayer <lucid_duck@justthetip.ca>
Link: https://github.com/Lucid-Duck/rtw89-usb3-gap/tree/main/evidence/crash-2026-04-11
Signed-off-by: Louis Kotze <loukot@gmail.com>
Tested-by: Devin Wittmayer <lucid_duck@justthetip.ca> # RTL8922AU (BrosTrend BE6500)
Tested-by: Devin Wittmayer <lucid_duck@justthetip.ca> # RTL8852AU (D-Link DWA-X1850 A1)
Tested-by: Devin Wittmayer <lucid_duck@justthetip.ca> # RTL8852AU (D-Link DWA-X1850 B1)
Tested-by: Devin Wittmayer <lucid_duck@justthetip.ca> # RTL8852BU (BrosTrend AX4L)
Tested-by: Devin Wittmayer <lucid_duck@justthetip.ca> # RTL8852CU (EDUP AX5400)
Acked-by: Ping-Ke Shih <pkshih@realtek.com>
---
Changes since v2:
  - Reword commit message per Ping-Ke review: remove v1 reference
    from permanent changelog, use "condition" instead of "bug",
    acknowledge hardware factors in calibration timing rather than
    asserting I/O bound.
  - Add Tested-by tags from Devin Wittmayer across 4 chipsets
    (RTL8922AU, RTL8852AU, RTL8852BU, RTL8852CU) on Framework 13 +
    Fedora 43 (6.19.11) and Raspberry Pi 5 + Pi OS (6.12.47).
  - Add Reported-by for independent confirmation including xHCI hard
    lockup evidence (CPU5 deadlock in usb_unanchor_urb after DACK
    timeout triggered driver recovery).

v2: https://lore.kernel.org/linux-wireless/20260415111339.453602-1-loukot@gmail.com/
v1: https://lore.kernel.org/linux-wireless/20260410080017.82946-1-loukot@gmail.com/
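
For reviewers who want to re-check the arithmetic in the first table of the commit message, here is a minimal userspace sketch. It is not driver code: sketch_rfk_timeout_ms(), struct sketch_rfk_row, sketch_rfk_table and sketch_count_over_budget() are hypothetical names, and the numbers are copied from the "PCIe timeout" and "Max measured (USB)" columns above. It assumes the same rule as the patch (multiply the budget by 4 on USB, leave PCIe untouched) and counts how many measured maxima exceed the resulting budget.

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_USB_RFK_MULT 4u	/* multiplier proposed in this patch */

/* Scale a calibration timeout the way the patch does for USB. */
static unsigned int sketch_rfk_timeout_ms(unsigned int ms, bool is_usb)
{
	return is_usb ? ms * SKETCH_USB_RFK_MULT : ms;
}

struct sketch_rfk_row {
	const char *name;
	unsigned int pcie_ms;		/* stock PCIe timeout */
	unsigned int usb_max_ms;	/* max measured on USB */
};

/* Values taken from the first table in the commit message. */
static const struct sketch_rfk_row sketch_rfk_table[] = {
	{ "PRE_NTFY",      5,   1 },
	{ "DACK",         58,  72 },
	{ "RX_DCK",      128, 374 },
	{ "TSSI normal",  20,  24 },
	{ "TSSI scan",     6,  14 },
	{ "TXGAPK",       54,  18 },
	{ "IQK",          84,  53 },
	{ "DPK",          34,  30 },
};

/* Count calibrations whose measured USB maximum exceeds the budget. */
static size_t sketch_count_over_budget(bool is_usb)
{
	size_t n = 0;

	for (size_t i = 0;
	     i < sizeof(sketch_rfk_table) / sizeof(sketch_rfk_table[0]); i++) {
		const struct sketch_rfk_row *row = &sketch_rfk_table[i];

		if (row->usb_max_ms > sketch_rfk_timeout_ms(row->pcie_ms, is_usb))
			n++;
	}
	return n;
}
```

With the stock budgets, four rows (DACK, RX_DCK, TSSI normal, TSSI scan) are over; with the 4x budgets, none are.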

 drivers/net/wireless/realtek/rtw89/phy.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/net/wireless/realtek/rtw89/phy.c b/drivers/net/wireless/realtek/rtw89/phy.c
index e70d0e283..1f249c297 100644
--- a/drivers/net/wireless/realtek/rtw89/phy.c
+++ b/drivers/net/wireless/realtek/rtw89/phy.c
@@ -3956,6 +3956,14 @@ int rtw89_phy_rfk_report_wait(struct rtw89_dev *rtwdev, const char *rfk_name,
 	struct rtw89_rfk_wait_info *wait = &rtwdev->rfk_wait;
 	unsigned long time_left;
 
+	/*
+	 * USB transport adds latency to H2C/C2H round-trips, so RF
+	 * calibrations take longer than on PCIe. Apply a 4x multiplier
+	 * to avoid spurious timeouts.
+	 */
+	if (rtwdev->hci.type == RTW89_HCI_TYPE_USB)
+		ms *= 4;
+
 	/* Since we can't receive C2H event during SER, use a fixed delay. */
 	if (test_bit(RTW89_FLAG_SER_HANDLING, rtwdev->flags)) {
 		fsleep(1000 * ms / 2);

base-commit: 1e33ef7657531b2361d53cca25f375b5626e76a9
-- 
2.53.0



* Re: [PATCH v2] ath11k: fix peer resolution on rx path when peer_id=0
From: Baochen Qiang @ 2026-04-16  3:04 UTC (permalink / raw)
  To: Matthew Leach, Jeff Johnson; +Cc: linux-wireless, ath11k, linux-kernel, kernel
In-Reply-To: <20260415-ath11k-null-peerid-workaround-v2-1-2abae3bbac16@collabora.com>



On 4/15/2026 7:39 PM, Matthew Leach wrote:
> It has been observed that on certain chipsets a peer can be assigned
> peer_id=0. For reception of standard MPDUs this is fine as
> ath11k_dp_rx_h_find_peer() has a fallback case where it locates the peer
> based upon the source mac address.
> 
> However, on an aggregated link, reception of AMSDUs results in the peer
> not being resolved for the second (and any subsequent) sub-MSDUs due to
> the peer_id guard in ath11k_dp_rx_h_find_peer(). This causes the

It is worth pointing out that the MAC address based fallback does not work for those
sub-MSDUs either, since the mpdu_start descriptor, from which the MAC address is
obtained, is not populated by hardware for them.

> encryption type of the frame to be set to an incorrect value, resulting
> in the sub-MSDUs being dropped by ieee80211.
> 
> ath11k_pci 0000:03:00.0: data rx skb 000000002f4b704d len 1534 peer xx:xx:xx:xx:xx:xx 0 ucast sn 3063 he160 rate_idx 9 vht_nss 2 freq 5240 band 1 flag 0x40d1a fcs-err 0 mic-err 0 amsdu-more 0 peer_id 0 first_msdu 1 last_msdu 0
> ath11k_pci 0000:03:00.0: data rx skb 0000000038acd580 len 1534 peer (null) 0 ucast sn 3063 he160 rate_idx 9 vht_nss 2 freq 5240 band 1 flag 0x40d00 fcs-err 0 mic-err 0 amsdu-more 0 peer_id 0 first_msdu 0 last_msdu 1
> 
> This patch removes the null peer_id check in ath11k_dp_rx_h_find_peer(),
> allowing peers with an assigned ID of 0 to be resolved.
> 
> Signed-off-by: Matthew Leach <matthew.leach@collabora.com>
> ---
> Changes in v2:
> 
> - Since peer_id=0 is a valid condition on some chips, remove the guard
>   that prevented the peer lookup.
> - Link to v1: https://patch.msgid.link/20260326-ath11k-null-peerid-workaround-v1-1-0c2fd53202f8@collabora.com
> 
> To: Jeff Johnson <jjohnson@kernel.org>
> Cc: linux-wireless@vger.kernel.org
> Cc: ath11k@lists.infradead.org
> Cc: linux-kernel@vger.kernel.org
> ---
>  drivers/net/wireless/ath/ath11k/dp_rx.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/drivers/net/wireless/ath/ath11k/dp_rx.c b/drivers/net/wireless/ath/ath11k/dp_rx.c
> index 49d959b2e148..ff2c78a4e5f3 100644
> --- a/drivers/net/wireless/ath/ath11k/dp_rx.c
> +++ b/drivers/net/wireless/ath/ath11k/dp_rx.c
> @@ -2215,8 +2215,7 @@ ath11k_dp_rx_h_find_peer(struct ath11k_base *ab, struct sk_buff *msdu)
>  
>  	lockdep_assert_held(&ab->base_lock);
>  
> -	if (rxcb->peer_id)
> -		peer = ath11k_peer_find_by_id(ab, rxcb->peer_id);
> +	peer = ath11k_peer_find_by_id(ab, rxcb->peer_id);
>  
>  	if (peer)
>  		return peer;

The other instance of this check, in ath11k_hal_rx_parse_mon_status_tlv(), was missed.
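
As a reminder of why the guard matters, here is a small userspace sketch. It is not ath11k code: struct sketch_peer and the sketch_find_*() helpers are hypothetical, the peer table is a flat array, and the MAC address fallback is left out. It only models the ID lookup with and without an "if (peer_id)" guard.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced peer entry. */
struct sketch_peer {
	uint16_t peer_id;
	const char *addr;
};

/* Plain lookup by ID; 0 is treated like any other ID. */
static struct sketch_peer *sketch_find_by_id(struct sketch_peer *peers,
					     size_t n, uint16_t id)
{
	for (size_t i = 0; i < n; i++)
		if (peers[i].peer_id == id)
			return &peers[i];
	return NULL;
}

/* Old behaviour: ID 0 is treated as "no ID" and the lookup is skipped. */
static struct sketch_peer *sketch_find_guarded(struct sketch_peer *peers,
					       size_t n, uint16_t id)
{
	struct sketch_peer *peer = NULL;

	if (id)
		peer = sketch_find_by_id(peers, n, id);
	return peer;
}

/* Behaviour after removing the guard: the lookup always runs. */
static struct sketch_peer *sketch_find_unguarded(struct sketch_peer *peers,
						 size_t n, uint16_t id)
{
	return sketch_find_by_id(peers, n, id);
}
```

Any call site that keeps the guarded form will still fail to resolve a peer whose assigned ID is 0.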

> 
> ---
> base-commit: f338e77383789c0cae23ca3d48adcc5e9e137e3c
> change-id: 20260326-ath11k-null-peerid-workaround-625a129781b1
> 
> Best regards,
> --  
> Matt
> 

