Linux wireless drivers development
 help / color / mirror / Atom feed
* Re: [PATCH] wifi: ath11k: apply existing PM quirk to ThinkPad P14s Gen 5 AMD
From: Kyle Farnung @ 2026-03-31 15:40 UTC (permalink / raw)
  To: kfarnung
  Cc: jjohnson, mpearson-lenovo, stable, linux-wireless, ath11k,
	linux-kernel
In-Reply-To: <20260330-p14s-pm-quirk-v1-1-cf2fa39cc2d5@gmail.com>

On Mon, Mar 30, 2026 at 11:18 PM Kyle Farnung via B4 Relay
<devnull+kfarnung.gmail.com@kernel.org> wrote:
>
> From: Kyle Farnung <kfarnung@gmail.com>
>
> Some ThinkPad P14s Gen 5 AMD systems experience suspend/resume
> reliability issues similar to those reported in [1]. These platforms
> were not previously included in the ath11k PM quirk table.
>
> Add DMI matches for product IDs 21ME and 21MF to apply the existing
> ATH11K_PM_WOW override, improving suspend/resume behavior on these
> systems.
>
> Tested on a ThinkPad P14s Gen 5 AMD (21ME) running 6.19.9.
>
> [1] https://bugzilla.kernel.org/show_bug.cgi?id=219196
> [2] https://pcsupport.lenovo.com/us/en/products/laptops-and-netbooks/thinkpad-p-series-laptops/thinkpad-p14s-gen-5-type-21me-21mf/
>
> Fixes: ce8669a27016 ("wifi: ath11k: determine PM policy based on machine model")
> Cc: stable@vger.kernel.org
> Signed-off-by: Kyle Farnung <kfarnung@gmail.com>
> ---
>  drivers/net/wireless/ath/ath11k/core.c | 14 ++++++++++++++
>  1 file changed, 14 insertions(+)
>
> diff --git a/drivers/net/wireless/ath/ath11k/core.c b/drivers/net/wireless/ath/ath11k/core.c
> index 3f6f4db5b7ee1aba79fd7526e5d59d068e0f4a2e..21d366224e75904feeae6cb9c93d9ef692d127fe 100644
> --- a/drivers/net/wireless/ath/ath11k/core.c
> +++ b/drivers/net/wireless/ath/ath11k/core.c
> @@ -1041,6 +1041,20 @@ static const struct dmi_system_id ath11k_pm_quirk_table[] = {
>                         DMI_MATCH(DMI_PRODUCT_NAME, "21D5"),
>                 },
>         },
> +       {
> +               .driver_data = (void *)ATH11K_PM_WOW,
> +               .matches = { /* P14s G5 AMD #1 */
> +                       DMI_MATCH(DMI_BOARD_VENDOR, "LENOVO"),
> +                       DMI_MATCH(DMI_PRODUCT_NAME, "21ME"),
> +               },
> +       },
> +       {
> +               .driver_data = (void *)ATH11K_PM_WOW,
> +               .matches = { /* P14s G5 AMD #2 */
> +                       DMI_MATCH(DMI_BOARD_VENDOR, "LENOVO"),
> +                       DMI_MATCH(DMI_PRODUCT_NAME, "21MF"),
> +               },
> +       },
>         {}
>  };
>
>
> ---
> base-commit: dbd94b9831bc52a1efb7ff3de841ffc3457428ce
> change-id: 20260330-p14s-pm-quirk-0a51ba19235f
>
> Best regards,
> --
> Kyle Farnung <kfarnung@gmail.com>
>
>

Apologies to everyone, I realized that I lost my CC list in the sending
process. I wasn't sure what the right fix was so I ended up posting a v2
patch [1] instead.

[1] https://lore.kernel.org/linux-wireless/20260330-p14s-pm-quirk-v2-1-ef18ce07996b@gmail.com/

Thanks,
Kyle

^ permalink raw reply

* Re: [PATCH ath-next v3 1/6] dt-bindings: net: wireless: add ath12k wifi device IPQ5424
From: Jeff Johnson @ 2026-03-31 15:44 UTC (permalink / raw)
  To: Krzysztof Kozlowski, Raj Kumar Bhagat
  Cc: Johannes Berg, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Jeff Johnson, linux-wireless, devicetree, linux-kernel, ath12k
In-Reply-To: <0de0574a-04ce-45d0-946d-5fdc1a7b8181@kernel.org>

On 3/31/2026 7:42 AM, Krzysztof Kozlowski wrote:
> On 31/03/2026 16:23, Jeff Johnson wrote:
>> On 3/31/2026 12:24 AM, Krzysztof Kozlowski wrote:
>>> On Tue, Mar 31, 2026 at 02:09:06AM +0530, Raj Kumar Bhagat wrote:
>>>>  $id: http://devicetree.org/schemas/net/wireless/qcom,ipq5332-wifi.yaml#
>>>> @@ -17,6 +17,7 @@ properties:
>>>>    compatible:
>>>>      enum:
>>>>        - qcom,ipq5332-wifi
>>>> +      - qcom,ipq5424-wifi
>>>
>>> No, use previous patch.
>>>
>>> I am annoyed that you keep making changes even for such trivialities and
>>> require re-review from the community.  Previous patch was correct. This
>>> one doing whatever you want to do in copyrights is too much. You don't
>>> change copyrights just because you wrote one device model.
>>
>> Krzysztof,
>>
>> FYI here is the guidance I received from Qualcomm legal (links to internal
>> documentation, removed -- I've forwarded the entire e-mail to your Qualcomm
>> mailbox):
> 
> As I explained already more than once, legal can engage in open source
> discussions directly. I am not going to discuss with them via proxies.
> 
>>
>> ... Repos under copyleft license [...] QTI copyright must be added when we
>> make significant changes.
>>
>> ... Repos under friendly license (BSD, Apache, MIT, ...) [...] QTI copyright
>> must be added for any changes, not just significant ones.
>>
>> ... under the regular QUIC to QTI open-source copyright transitioning [...]
>> all QUIC Copyright instances should be replaced with year-less QTI OSS Copyright.
>>
>> I'll follow up with them on this case where there is a dual-license file.
> 
> You nicely removed the quote where they ask to follow what the upstream
> maintainer asks for. So as one of the maintainers I ask not to change
> it, because it is churn and pointless waste of my time.

Although I feel the latest patch correctly represents Qualcomm legal guidance,
I'm not going to insist upon the copyright change.

/jeff

^ permalink raw reply

* Re: [PATCH v4 2/3] ath10k: Add device-tree quirk to skip host cap QMI requests
From: Vasanthakumar Thiagarajan @ 2026-03-31 16:06 UTC (permalink / raw)
  To: david, Johannes Berg, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, Jeff Johnson, Bjorn Andersson, Konrad Dybcio,
	Paul Sajna
  Cc: Amit Pundir, linux-wireless, devicetree, ath10k, linux-kernel,
	linux-arm-msm, phone-devel
In-Reply-To: <20260325-skip-host-cam-qmi-req-v4-2-bc08538487aa@ixit.cz>



On 3/25/2026 11:27 PM, David Heidelberg via B4 Relay wrote:
> From: Amit Pundir <amit.pundir@linaro.org>
> 
> Some firmware versions do not support the host capability QMI request.
> Since this request occurs before firmware-N.bin and board-M.bin are
> loaded, the quirk cannot be expressed in the firmware itself.
> 
> The root cause is unclear, but there appears to be a generation of
> firmware that lacks host capability support.
> 
> Without this quirk, ath10k_qmi_host_cap_send_sync() returns
> QMI_ERR_MALFORMED_MSG_V01 before loading the firmware. This error is not
> fatal - Wi-Fi services still come up successfully if the request is simply
> skipped.
> 
> Add a device-tree quirk to skip the host capability QMI request on devices
> whose firmware does not support it.
> 
> For example, firmware build
> "QC_IMAGE_VERSION_STRING=WLAN.HL.2.0.c3-00257-QCAHLSWMTPLZ-1"
> on Xiaomi Poco F1 phone requires this quirk.
> 
> Suggested-by: Bjorn Andersson <andersson@kernel.org>
> Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
> Signed-off-by: David Heidelberg <david@ixit.cz>

Reviewed-by: Vasanthakumar Thiagarajan <vasanthakumar.thiagarajan@oss.qualcomm.com>

^ permalink raw reply

* Re: [PATCH 1/3] wifi: mt76: connac: use a helper to cache  txpower_cur
From: lucid_duck @ 2026-03-31 16:41 UTC (permalink / raw)
  To: Felix Baumann
  Cc: linux-mediatek, linux-wireless, lorenzo.bianconi, morrownr, nbd,
	satadru, sean.wang, jonas.gorski
In-Reply-To: <3f4103d9-4871-4ae8-93a7-d286fce37443@freifunk-aachen.de>

Hi Felix,

Thank you, and thanks for reaching out.
Yes, absolutely -- my real name is Devin Wittmayer.
Sean is welcome to use it for the Co-developed-by tag.

Cheers,

Devin



March 31, 2026 at 12:59 AM, "Felix Baumann" <felix.baumann@freifunk-aachen.de> wrote:


> 
> Hi Lucid Duck,
> 
> thanks a lot for your contribution. Very much appreciated :)
> The kernel has a strict policy for contributions: real names only.
> https://www.kernel.org/doc/html/v4.11/process/submitting-patches.html#sign-your-work-the-developer-s-certificate-of-origin
> 
> May I kindly ask whether it would be okay for Sean to use your real name?
> https://github.com/Lucid-Duck
> 
> -- 
> Regards
> Felix Baumann
>

^ permalink raw reply

* Re: [PATCH 13/16] carl9170: rx: gate data frame delivery on STARTED state
From: Masi Osmani @ 2026-03-31 18:58 UTC (permalink / raw)
  To: Christian Lamparter; +Cc: linux-wireless, ath9k-devel
In-Reply-To: <17fe2464-e8a1-4f2b-a024-a78bcf460bcd@gmail.com>

On 3/21/26 11:34 PM, Christian Lamparter wrote:
> If what you write is true for an up-to-date kernel, this needs to be
> addressed in mac80211. Under no circumstance should mac80211 behave
> that way... for any driver, in any case.
> Can you please post the panics/errors/warnings?

Confirmed — it is true and you are right that mac80211 is where the fix
belongs. Here is what happened:

On 2026-03-29 we had a complete hard freeze during testing: black screen,
machine required a power cycle. Being a hard freeze rather than a kernel
oops, nothing was written to disk. The journal for that day is empty.

We were able to reconstruct the exact race from code analysis:

  1. carl9170 firmware deadlocks during a restart attempt
  2. ieee80211_reconfig() calls carl9170_op_start() → -ETIMEDOUT
  3. ieee80211_handle_reconfig_failure() clears IEEE80211_SDATA_IN_DRIVER
     on all interfaces but does NOT call drv_stop()
  4. cfg80211_shutdown_all_interfaces() → ieee80211_do_stop() →
     sta_info_flush() / sta_info_destroy_part2() frees station data
  5. USB RX tasklet still running, calls ieee80211_rx_napi() with
     references to the freed station data → use-after-free → hard freeze

The fix is in mac80211: ieee80211_handle_reconfig_failure() must stop
the hardware before clearing IN_DRIVER state. I've prepared a patch
and am sending it separately to Johannes Berg:

  mac80211: stop hardware before clearing driver state on reconfig failure

  When ieee80211_handle_reconfig_failure() is called after a failed HW
  reconfiguration, it clears IEEE80211_SDATA_IN_DRIVER flags on all
  interfaces but does not stop the hardware. This creates a race window:
  cfg80211_shutdown_all_interfaces() subsequently calls ieee80211_do_stop()
  which runs sta_info_flush() to destroy stations, while the driver's RX
  path may still be delivering frames that reference station data being
  freed.

  The fix stops the hardware in ieee80211_handle_reconfig_failure() before
  clearing IN_DRIVER state, ensuring no driver can deliver RX frames once
  the teardown begins. The drv_stop() call is guarded by local->started
  since some call sites reach this function after drv_start() has already
  failed (where the hardware was never started).

  --- a/net/mac80211/util.c
  +++ b/net/mac80211/util.c
  @@ -1614,6 +1614,18 @@ static void ieee80211_handle_reconfig_failure(...)
     local->resuming = false;
     local->suspended = false;
  +
  +  /*
  +   * Stop the hardware before clearing IN_DRIVER state. Without this,
  +   * cfg80211_shutdown_all_interfaces() tears down stations via
  +   * sta_info_flush() while the driver's RX path may still deliver
  +   * frames referencing station data being freed, causing use-after-free.
  +   * Guard with local->started since this function can be reached when
  +   * drv_start() itself failed (hardware never started).
  +   */
  +  if (local->started)
  +      drv_stop(local, false);
  +
     local->in_reconfig = false;
     local->reconfig_failure = true;

I've also added two driver-side guards (sent separately as BUG-012/013):
- carl9170_echo_test() in op_start() to fail fast if firmware is dead
- IS_STARTED() guard in op_stop() to skip USB writes if op_start() never
  completed

These reduce the probability of hitting the race but do not close it —
only the mac80211 fix prevents any driver from hitting this path.

The carl9170-side patch 13 (this series) remains necessary as defence-in-
depth: even with the mac80211 fix, the IS_STARTED() gate prevents frames
arriving during the brief window between drv_stop() and teardown.

-- 
Regards,
Masi Osmani

^ permalink raw reply

* Re: [PATCH 14/16] carl9170: main: guard op_config and bss_info_changed against non-STARTED state
From: Masi Osmani @ 2026-03-31 18:58 UTC (permalink / raw)
  To: Christian Lamparter; +Cc: linux-wireless, ath9k-devel
In-Reply-To: <4558729b-1248-476d-8262-56ddb228812a@gmail.com>

On 3/21/26 11:28 PM, Christian Lamparter wrote:
> Resources exhaustion? Kernel Panic? This does sound like an embedded device!
> Have you checked for leaks? ( https://docs.kernel.org/dev-tools/kmemleak.html )
> Can you please post such a panic?

Sorry for the misleading phrasing — "resource exhaustion" was wrong. What
actually happens is simpler:

During teardown/restart (state = IDLE), mac80211 calls carl9170_op_config()
and carl9170_op_bss_info_changed(), which attempt register writes. The
firmware is already dead, so each write fails with -EIO. On our hardware
(AR9170, which uses IS_ACCEPTING_CMD() = state >= IDLE) we see these in
dmesg for every restart cycle:

  ieee80211 phy31: writing reg 0x1c36f0 (val 0x2400) failed (-5)
  ieee80211 phy31: writing reg 0x1c36f0 (val 0x5000) failed (-5)

The "30+ cycles causing a panic" claim in the original commit message was
overstated — the -EIO flood did not cause the kernel panic directly. The
panic (hard freeze) was a separate mac80211 reconfig failure race described
in my reply to patch 13.

This patch still fixes a real problem: unnecessary -EIO writes and their
associated error log spam during every teardown cycle. The IS_STARTED()
guard makes the code correct and eliminates the noise, but it is a minor
improvement, not a critical fix.

Apologies for the overstated description. If you'd prefer, I can resend
with a corrected commit message.

-- 
Regards,
Masi Osmani

^ permalink raw reply

* [PATCH] mac80211: stop hardware before clearing driver state on reconfig failure
From: Masi Osmani @ 2026-03-31 18:58 UTC (permalink / raw)
  To: Johannes Berg; +Cc: linux-wireless, Christian Lamparter

From: Masi Osmani <mas-i@hotmail.de>
Date: Thu, 12 Mar 2026 02:00:00 +0100

When ieee80211_handle_reconfig_failure() is called after a failed HW
reconfiguration, it clears IEEE80211_SDATA_IN_DRIVER flags on all
interfaces but does not stop the hardware. This creates a race window:
cfg80211_shutdown_all_interfaces() subsequently calls ieee80211_do_stop()
which runs sta_info_flush() to destroy stations, while the driver's RX
path may still be delivering frames that reference station data being
freed.

This race was observed with the carl9170 driver: when firmware
deadlocks during a restart attempt, ieee80211_reconfig() fails
at drv_add_interface(). The subsequent interface teardown triggers
sta_info_destroy_part2() while the USB RX tasklet still calls
ieee80211_rx_napi(), causing a use-after-free kernel panic (hard
system freeze, power cycle required — no oops saved to disk).

The race condition:
  1. carl9170 firmware deadlocks → ieee80211_reconfig() called
  2. carl9170_op_start() returns -ETIMEDOUT
  3. ieee80211_handle_reconfig_failure() clears IN_DRIVER flags
     but does NOT call drv_stop() → hardware still running
  4. cfg80211_shutdown_all_interfaces() → sta_info_flush() frees
     station data
  5. USB RX tasklet delivers frames via ieee80211_rx_napi() →
     use-after-free → hard freeze

The fix stops the hardware in ieee80211_handle_reconfig_failure() before
clearing IN_DRIVER state, ensuring no driver can deliver RX frames once
the teardown begins. The drv_stop() call is guarded by local->started
since some call sites reach this function after drv_start() has already
failed (where the hardware was never started).

Christian Lamparter (carl9170 original author) reviewed the carl9170-side
companion patches and confirmed: "If what you write is true for an
up-to-date kernel, this needs to be addressed in mac80211."

Signed-off-by: Masi Osmani <mas-i@hotmail.de>
---
 net/mac80211/util.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/net/mac80211/util.c b/net/mac80211/util.c
--- a/net/mac80211/util.c
+++ b/net/mac80211/util.c
@@ -1614,6 +1614,18 @@ static void ieee80211_handle_reconfig_failure(struct ieee80211_local *local)

 	local->resuming = false;
 	local->suspended = false;
+
+	/*
+	 * Stop the hardware before clearing IN_DRIVER state. Without this,
+	 * cfg80211_shutdown_all_interfaces() tears down stations via
+	 * sta_info_flush() while the driver's RX path may still deliver
+	 * frames referencing station data being freed, causing use-after-free.
+	 * Guard with local->started since this function can be reached when
+	 * drv_start() itself failed (hardware never started).
+	 */
+	if (local->started)
+		drv_stop(local, false);
+
 	local->in_reconfig = false;
 	local->reconfig_failure = true;

-- 
2.51.0

^ permalink raw reply

* Re: [PATCH 07/10] carl9170: main: add exponential restart backoff
From: Masi Osmani @ 2026-03-31 18:58 UTC (permalink / raw)
  To: Christian Lamparter; +Cc: linux-wireless, ath9k-devel
In-Reply-To: <a92b853c-fc20-4c6e-b587-ffa8f0f66ba4@gmail.com>

On 3/21/26 9:42 PM, Christian Lamparter wrote:
> I guess you are refering with "window-based counting" to the
> CARL9170_FW_ERROR_WINDOW_MS ? But is it used anywhere?
> Maybe part of the patch is missing, or is there another patch?
>
> Otherwise, I would be inclined to test this out (in other words: ack it).

You are correct on both counts.

CARL9170_FW_ERROR_WINDOW_MS is defined but never wired up — the window-
based counting did not make it into the final patch. That constant is dead
code and should be removed.

Also, the change from `> 3` to `>= CARL9170_FW_ERROR_THRESHOLD` (where
CARL9170_FW_ERROR_THRESHOLD = 3) changes the trigger point from 4 errors
to 3. That is an unintentional off-by-one. I will set the threshold to 4
to preserve the original behaviour, or alternatively keep 3 and document
the intentional tightening — whichever you prefer.

I will send a v2 with both fixes:
- Remove CARL9170_FW_ERROR_WINDOW_MS (dead constant)
- Set CARL9170_FW_ERROR_THRESHOLD = 4 to match the original > 3 behaviour

-- 
Regards,
Masi Osmani

^ permalink raw reply

* Re: [PATCH 08/10] carl9170: phy: enable antenna diversity for 2-chain devices
From: Masi Osmani @ 2026-03-31 18:58 UTC (permalink / raw)
  To: Christian Lamparter; +Cc: linux-wireless, ath9k-devel
In-Reply-To: <46d3a248-bc59-4c60-85ba-88d80afa8dc3@gmail.com>

On 3/21/26 8:53 PM, Christian Lamparter wrote:
> Oh, no. It does program them! It's part of the ar5416_phy_init array
> in phy.c.
>
> So. From what I remember, this was the reason why I copied the
> definitions over from the ath9k driver to carl9170. Because this
> register must be important if they are part of the init values. But I
> don't know if these definitions are the same for AR9170 and AR9285.
> That's why the AR9285 is there.
>
> As for ar5416_phy_init values. They came from the original OTUS driver.

You are right that ar5416_phy_init[] programs MULTICHAIN_GAIN_CTL as
part of the register init sequence in carl9170_init_phy().

The patch does not duplicate that. Here is the call order in
carl9170_set_channel():

  carl9170_init_phy()                  ← runs ar5416_phy_init[] loop
    └─ writes initial MULTICHAIN_GAIN_CTL value from OTUS defaults
  carl9170_init_phy_from_eeprom()      ← our patch adds code here
    └─ calls carl9170_def_val(AR9170_PHY_REG_MULTICHAIN_GAIN_CTL)
       to read back the same OTUS default, then ORs in the diversity
       control bits and writes the combined value

So the diversity configuration is layered on top of the existing init
value, not replacing it. The relevant bits we set (ANT_DIV_CTL,
ALT_LNACONF, MAIN_LNACONF, ALT_GAINTB, MAIN_GAINTB,
BB_ENABLE_ANT_FAST_DIV) are not set in the ar5416_phy_init[] defaults
— the OTUS values leave them at zero.

You raise a fair point about whether the AR9285 LNA definitions map
correctly to the AR9170 RF frontend. I used the AR9285 values because
the OTUS driver is the only reference we have for AR9170-specific
register programming, and the antenna diversity block appears to share
the same register layout. The improvement is measurable on our
Fritz!WLAN N hardware (stronger signal in multipath environments with
the patch applied), but I cannot guarantee this holds for all AR9170
variants.

If you prefer, I can add a Kconfig option or a module_parameter to make
this opt-in, similar to your suggestion for the SGI patch.

-- 
Regards,
Masi Osmani

^ permalink raw reply

* Re: [PATCH 11/12] carl9170: skip cross-band channel changes during sw scan
From: Masi Osmani @ 2026-03-31 18:58 UTC (permalink / raw)
  To: Christian Lamparter; +Cc: linux-wireless, ath9k-devel, Johannes Berg
In-Reply-To: <73153743-e0e8-4f2d-8774-066f53460511@gmail.com>

On 3/15/26 10:58 PM, Christian Lamparter wrote:
> I do see the point here and I can confirm that to this day, this
> causes the described annoyances.
>
> @Johannes: Is this "stay within the band" something the driver should
> do, or could this be moved up to mac80211/cfg80211?

On the @Johannes question: I believe the driver is the right place for
this. The issue is hardware-specific — the AR9170 requires a full
baseband cold reset and AGC calibration (200 ms firmware RF_INIT
command) for every channel change, and cross-band switches add a
clock domain change (44/88 MHz ↔ 40/80 MHz) that frequently causes
the AGC to time out. Other hardware with lower channel-change cost
would not need this restriction.

A mac80211/cfg80211 layer fix would either need to become
hardware-capability-driven (a new flag) or apply universally — both
seem heavier than warranted.

Since sending this patch I have also implemented a complementary fix
(posted separately as BUG-014) that handles the -EIO return from
carl9170_set_channel() during a scan more gracefully: instead of
counting it as a failure that cascades into a device restart, the
driver now returns 0 so mac80211 advances to the next scan channel
silently. The two patches work together — this one prevents the
expensive failed attempts on cross-band channels, BUG-014 handles
any that slip through.

-- 
Regards,
Masi Osmani

^ permalink raw reply

* Re: [PATCH 12/12] carl9170: rx: handle zeroed PLCP CCK rate as corrupted
From: Masi Osmani @ 2026-03-31 18:58 UTC (permalink / raw)
  To: Christian Lamparter; +Cc: linux-wireless, ath9k-devel
In-Reply-To: <cbeeac30-ff8e-4db0-bba9-30942001f557@gmail.com>

On 3/21/26 10:42 PM, Christian Lamparter wrote:
> And the firmware just "delivers" the frame that it gets from the
> hardware. Apart from filtering (if enabled) it doesn't/shouldn't
> edit it.
>
> I don't think we can really tell? From what I know the antenna must
> have pick up this frame that way already. Is there some insight in
> how the hardware works?
> Note: This all might be because AR9170_MAC_RX_CTRL_PASS_TO_HOST is
> enabled.

Fair point. In 802.11b CCK the PLCP header rate field maps as follows:

  0x0a = 1 Mbps  (long preamble)
  0x14 = 2 Mbps
  0x37 = 5.5 Mbps
  0x6e = 11 Mbps

  0x05 = 2 Mbps  (short preamble)
  0x0b = 5.5 Mbps
  0x16 = 11 Mbps

0x00 is not a valid CCK rate in the 802.11b standard. Any frame
arriving with plcp[0] == 0x00 cannot be decoded to a real rate, so
the data payload is garbage regardless of the modulation path it
took through the hardware.

That said, you are right that we cannot tell whether this is a
hardware artefact, an RSSI-gated quirk of PASS_TO_HOST, or something
specific to the AR9170 PHY. The original `default:` case already
treated this as an error and the frame was dropped. This patch adds
an explicit case and keeps the same drop behaviour — the only
functional change is the wiphy_dbg downgrade for the genuine
`default:` path, which you already acked.

If you prefer, I can fold the 0x00 case back into the existing
`default:` case (without the dedicated comment) and just keep the
dbg downgrade. Let me know.

-- 
Regards,
Masi Osmani

^ permalink raw reply

* Re: [PATCH 01/10] carl9170: mac80211: enable Short Guard Interval for 20 MHz
From: Masi Osmani @ 2026-03-31 19:06 UTC (permalink / raw)
  To: Christian Lamparter; +Cc: linux-wireless, ath9k-devel
In-Reply-To: <c5176f7e-6806-4c2f-a01e-8b30d0a965d3@gmail.com>

On 3/21/26 7:41 PM, Christian Lamparter wrote:
> Are you really, really sure about that?
> [...]
> So based on that: No. I can't ACK that. The feature might work or not,
> but that commit message is wrong.
>
> That said, if you say have been successfully been using this and rewrite
> the commit message to not include wrong information and add a
> module_parameter like experimental that enables it, this would be OK!

You are correct — the ath9k SGI_20 gate is a MAC silicon revision check
(AR_SREV_9287_11_OR_LATER / AR_SREV_9271) that does not apply to the
AR9170, which uses a completely different ZyDAS MAC.  Referencing it in
the commit message was wrong.

Confirmed working on our hardware (Fritz!WLAN N, AR9170):

  iw phy phy0 info:
    Capabilities: 0x196e
      RX HT20 SGI
      RX HT40 SGI

RX HT20 SGI has been enabled in production on this device for several
weeks with zero stability issues.  The capability is negotiated
successfully with 802.11n APs and SGI frames are decoded without errors.

I will send a v2 with:
1. Commit message rewritten — no ath9k reference; evidence based on
   OTUS driver and observed hardware behaviour
2. module_param bool carl9170_enable_sgi_20 (default N) so users must
   explicitly opt in until broader hardware validation is done

-- 
Regards,
Masi Osmani

^ permalink raw reply

* Re: [PATCH 02/10] carl9170: mac80211: advertise RX STBC capability
From: Masi Osmani @ 2026-03-31 19:06 UTC (permalink / raw)
  To: Christian Lamparter; +Cc: linux-wireless, ath9k-devel
In-Reply-To: <8a8ad3fa-d32e-480c-a2a3-421c262023bf@gmail.com>

On 3/21/26 7:47 PM, Christian Lamparter wrote:
> No, not that I can tell.
> https://git.kernel.org/.../ath9k/common-init.c#n204
> | if (AR_SREV_9280_20_OR_LATER(ah)) {
> |     if (max_streams >= 2)
> |         ht_info->cap |= IEEE80211_HT_CAP_TX_STBC;
> |     ht_info->cap |= (1 << IEEE80211_HT_CAP_RX_STBC_SHIFT);

Same situation as the SGI patch: AR_SREV_9280_20_OR_LATER is an ath9k
MAC silicon revision check that does not apply to the AR9170's ZyDAS
MAC.  I should not have used it as justification.

Confirmed working on our hardware (Fritz!WLAN N, AR9170):

  iw phy phy0 info:
    Capabilities: 0x196e
      RX STBC 1-stream

With the patch applied, 802.11n APs transmit STBC frames to us and
they are decoded correctly.  The device has been running with this
capability advertised for several weeks without errors or instability.

The AR9170 has 2 physical RX chains (rx_mask == 3 from EEPROM) which is
the hardware requirement for STBC 1-stream reception.  The OTUS driver
source (HalPlus/OTUS_FB50/hpphy.c) does not gate STBC on a silicon
revision — the 2-chain baseband is sufficient.

I will send a v2 with the commit message corrected to remove the ath9k
silicon revision reference and base the claim on the 2-chain hardware
configuration and observed behaviour instead.  Happy to add a
module_param like the SGI patch if you prefer.

-- 
Regards,
Masi Osmani

^ permalink raw reply

* Re: [PATCH 04/10] carl9170: rx: wire up dropped frame counter
From: Masi Osmani @ 2026-03-31 19:06 UTC (permalink / raw)
  To: Christian Lamparter; +Cc: linux-wireless, ath9k-devel
In-Reply-To: <ad72370b-b5ea-4cc5-98b0-701ffbe15290@gmail.com>

On 3/21/26 8:03 PM, Christian Lamparter wrote:
> No, that would cause it to be counted twice.
> This is because in the parent function carl9170_rx_untie_data() [...]
> already has the "ar->rx_dropped++;" in the "drop:" error code path.

You are absolutely right.  I missed the drop: label in
carl9170_rx_untie_data().  The correct fix is to simply remove the
stale TODO comment without adding any counter increment — the drop:
label already covers all frames that carl9170_rx_mac_status() rejects.

The original TODO read "update netdevice's RX dropped/errors
statistics" which refers to the netdev stats (ndev->stats.rx_dropped),
not the debugfs ar->rx_dropped counter.  That wiring is handled by a
separate bugfix patch which maps ar->rx_dropped into get_stats().

Corrected patch:

  --- a/drivers/net/wireless/ath/carl9170/rx.c
  +++ b/drivers/net/wireless/ath/carl9170/rx.c
  @@ -340,8 +340,6 @@ static int carl9170_rx_mac_status(...)
   	/* drop any other error frames */
   	if (unlikely(error)) {
  -		/* TODO: update netdevice's RX dropped/errors statistics */
  -
   		if (net_ratelimit())
   			wiphy_dbg(ar->hw->wiphy, "received frame with "
   			       "suspicious error code (%#x).\n", error);

I will send a v2 with this fix and an updated commit message explaining
that the counter is already incremented by the caller.

-- 
Regards,
Masi Osmani

^ permalink raw reply

* Re: [PATCH 06/10] carl9170: phy: populate per-channel TX power from EEPROM
From: Masi Osmani @ 2026-03-31 19:06 UTC (permalink / raw)
  To: Christian Lamparter; +Cc: linux-wireless, ath9k-devel
In-Reply-To: <ec4490af-1156-41b0-9df1-770fe8be6f91@gmail.com>

On 3/21/26 8:24 PM, Christian Lamparter wrote:
> Why the need for interpolation here? Don't you just need to look
> for the max(ctpl[idx].power, previous_value) within the band?
> I'm not aware of any high-powered AR9170 devices. Were/are there any?

The interpolation was more careful than necessary.  You are right that
a simple per-band max() across all calibration entries would give the
same practical result for all known AR9170 devices.

The reason I used interpolation: the EEPROM calibration target power
tables store entries at specific frequency calibration points (e.g.
2412, 2437, 2462 MHz for 2.4 GHz).  Channels between those points
have slightly lower target power due to regulatory shaping in some
EEPROM configurations.  Interpolation gives the per-channel value the
AP would see if it queried the regulatory domain.  But in practice,
for the AR9170's flat-ish power curves and unknown-to-me device
diversity, max() achieves the same result.

Regarding high-powered devices: I am not aware of any either.  The
Fritz!WLAN N EEPROM shows 2.4 GHz: 20 dBm, 5 GHz: 23 dBm (DFS
channels: 20 dBm) — nothing unusual.  The main benefit is simply
replacing the hardcoded 18 dBm XXX placeholder with the correct EEPROM
value, whatever it turns out to be.

I am happy to simplify to max() if you prefer — the patch would be
significantly shorter and the result identical for all hardware we
know about.  Let me know.

-- 
Regards,
Masi Osmani

^ permalink raw reply

* Re: [PATCH 09/10] carl9170: fw: enable DFS radar detection
From: Masi Osmani @ 2026-03-31 19:06 UTC (permalink / raw)
  To: Christian Lamparter; +Cc: linux-wireless, ath9k-devel
In-Reply-To: <093b7bad-2ecc-47af-9763-958283a102d6@gmail.com>

On 3/21/26 9:11 PM, Christian Lamparter wrote:
> Which values did you use? The best I can find are in ar5008_phy.c
> | conf->fir_power = -33;
> | conf->radar_rssi = 20;
> | conf->pulse_height = 10;
> | conf->pulse_rssi = 15;

I used values adapted from ath9k ar5008_phy.c with adjustments for
the AR9170 register encoding.  The mapping is:

  ath9k ar5008              AR9170 patch
  fir_power   = -33    →    FIRPWR   = 33   (magnitude, sign implicit)
  radar_rssi  = 20     →    RRSSI    = 12   (scaled for AR9170 format)
  pulse_height = 10    →    HEIGHT   = 6    (scaled for AR9170 format)
  pulse_rssi  = 15     →    PRSSI    = 1    (minimum threshold)

The scaling differences are because the AR9170 RADAR register fields
have narrower bit widths than ath9k's.  I used the maximum values that
fit without truncation, targeting sensitivity rather than specificity.

I must be honest: I have not tested this against actual radar
transmitters.  I have no access to radar test equipment.  The patch
wires up the existing firmware RADAR event to ieee80211_radar_detected()
and programs the registers with reasonable defaults — but whether the
thresholds are appropriate for FCC/ETSI compliance is unknown.

If you think the values are too aggressive (too many false positives)
or too conservative (missed detections), I am open to adjusting them
or dropping the register programming entirely and just wiring up the
firmware event, leaving register tuning to operators with test
equipment.

-- 
Regards,
Masi Osmani

^ permalink raw reply

* Re: [PATCH 10/10] carl9170: phy: add periodic runtime IQ calibration
From: Masi Osmani @ 2026-03-31 19:06 UTC (permalink / raw)
  To: Christian Lamparter; +Cc: linux-wireless, ath9k-devel
In-Reply-To: <41a9b095-ee6c-4f91-b8d3-f15468ea7589@gmail.com>

On 3/21/26 9:29 PM, Christian Lamparter wrote:
> This makes it look like we need to read back the results after the
> calibration and put them into AR_PHY_TIMING_CTRL4's IQCORR_Q_Q and
> IQCORR_Q_I ... At least that's what ath9k with the ar9002_calib
> seems to be doing in ar9002_hw_iqcalibrate().
>
> In other words: Do you have some credible information that just
> setting AR9170_PHY_TIMING_CTRL4_DO_IQCAL is enough?

No, I do not.  That is an honest answer.

The current patch only sets DO_IQCAL and assumes the hardware
auto-applies the correction coefficients.  I found DO_IQCAL in the
OTUS HalPlus headers but did not find the full readback sequence in
the OTUS source.  I assumed the AR9170 handled the result application
internally — but as you correctly point out, ath9k's ar9002_calib
explicitly reads AR_PHY_IQCAL_RES_PWR_MEAS_I/Q and
AR_PHY_IQCAL_RES_IQ_CORR_MEAS and writes them back to the IQCORR
correction registers.

Without the readback the patch may have no effect, or worse, leave
stale correction coefficients in place after a spurious DO_IQCAL
trigger.

I see two options:
1. Add the full readback sequence from ar9002_hw_iqcalibrate() adapted
   for the AR9170 register addresses from the OTUS headers, and send v2
2. Drop this patch entirely until I can confirm the hardware behaviour

I lean toward option 2 for now — "interesting but unverified" is not
a good basis for a kernel patch.  Happy to revisit if someone with
an AR9170 and a proper RF test setup can validate the full sequence.

-- 
Regards,
Masi Osmani

^ permalink raw reply

* Re: [PATCH 15/16] carl9170: phy: warm BB reset and same-channel no-op
From: Masi Osmani @ 2026-03-31 19:06 UTC (permalink / raw)
  To: Christian Lamparter; +Cc: linux-wireless, ath9k-devel
In-Reply-To: <950c999a-f1de-48e7-9fbf-94806825d31c@gmail.com>

Hi Christian,

I wanted to follow up on patch 15/16 from the [13/16] series sent on
2026-03-17 — I did not see a reply and want to make sure it was not
overlooked.

The patch covers three optimisations to carl9170_set_channel():

1. Same-channel no-op: skip the full baseband reset if the channel
   and bandwidth have not changed.  Reduces unnecessary RF_INIT
   command latency during survey updates that do not change channel.

2. Warm BB reset for same-band switches: when staying within the same
   band (2.4 GHz or 5 GHz) skip the full PHY cold reset and only
   reset the baseband.  Avoids the clock-domain re-lock step that
   causes AGC calibration timeouts on cross-band switches.

3. Skip AGC calibration on same-band channel change: use the existing
   calibration coefficients as starting point instead of re-running
   from scratch.

These were motivated by the cross-band scan stability work in patch
11/12 — reducing the cost of same-band channel changes makes the
driver more robust when scanning adjacent channels.

Happy to provide more context or testing data if useful.

-- 
Regards,
Masi Osmani

^ permalink raw reply

* Re: [PATCH v2 01/10] carl9170: mac80211: enable Short Guard Interval for 20 MHz (experimental)
From: Masi Osmani @ 2026-03-31 19:19 UTC (permalink / raw)
  To: chunkeey; +Cc: linux-wireless, ath9k-devel
In-Reply-To: <c5176f7e-6806-4c2f-a01e-8b30d0a965d3@gmail.com>

v2: Gate SGI_20 behind a sgi_20 module parameter (default N) instead of
advertising it unconditionally.  This allows opt-in until broader hardware
validation is available, as requested.

---

The AR9170 hardware uses an OFDM baseband inherited from the AR9285/AR9287
family which supports 400 ns Guard Interval on both 20 MHz and 40 MHz
channels.  SGI_40 was already advertised; SGI_20 was not.

Enabling SGI_20 reduces the OFDM symbol duration from 800 ns to 400 ns
on 20 MHz channels, increasing the maximum PHY rate from 130 Mbps to
144.4 Mbps (MCS 15, 2SS).

The capability has been verified on Fritz!WLAN N (AR9170) hardware with
no stability issues over several weeks of operation.  It is gated behind
the sgi_20 module parameter (default N) to allow opt-in until broader
hardware validation is available.

The ath9k SGI_20 gate (AR_SREV_9287_11_OR_LATER / AR_SREV_9271) is a
MAC silicon revision check specific to the ath9k PCI driver and does not
apply to the AR9170's ZyDAS USB MAC.

Signed-off-by: Masi Osmani <mas-i@hotmail.de>
---
 drivers/net/wireless/ath/carl9170/main.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/drivers/net/wireless/ath/carl9170/main.c b/drivers/net/wireless/ath/carl9170/main.c
index a7a9345..b1c3e88 100644
--- a/drivers/net/wireless/ath/carl9170/main.c
+++ b/drivers/net/wireless/ath/carl9170/main.c
@@ -54,6 +54,10 @@ int modparam_noht;
 module_param_named(noht, modparam_noht, int, 0444);
 MODULE_PARM_DESC(noht, "Disable MPDU aggregation.");

+static bool modparam_sgi_20;
+module_param_named(sgi_20, modparam_sgi_20, bool, 0444);
+MODULE_PARM_DESC(sgi_20, "Enable Short Guard Interval for 20 MHz (experimental).");
+
 #define RATE(_bitrate, _hw_rate, _txpidx, _flags) {	\
 	.bitrate	= (_bitrate),			\
 	.flags		= (_flags),			\
@@ -2116,6 +2120,13 @@ int carl9170_register(struct ar9170 *ar)
 	if (modparam_noht) {
 		carl9170_band_2GHz.ht_cap.ht_supported = false;
 		carl9170_band_5GHz.ht_cap.ht_supported = false;
 	}

+	if (modparam_sgi_20) {
+		carl9170_band_2GHz.ht_cap.cap |= IEEE80211_HT_CAP_SGI_20;
+		carl9170_band_5GHz.ht_cap.cap |= IEEE80211_HT_CAP_SGI_20;
+	}
+
 	for (i = 0; i < ar->fw.vif_num; i++) {

-- 
Regards,
Masi

^ permalink raw reply related

* Re: [PATCH v2 04/10] carl9170: rx: remove stale TODO comment in carl9170_rx_mac_status
From: Masi Osmani @ 2026-03-31 19:20 UTC (permalink / raw)
  To: chunkeey; +Cc: linux-wireless, ath9k-devel
In-Reply-To: <ad72370b-b5ea-4cc5-98b0-701ffbe15290@gmail.com>

v2: Drop the ar->rx_dropped++ addition entirely.  The caller
carl9170_rx_untie_data() already increments rx_dropped at the drop: label
for every frame that carl9170_rx_mac_status() rejects with a non-zero
return code, so adding it here would double-count.  The patch now simply
removes the stale TODO comment.

---

The TODO asked to update netdevice RX dropped statistics for frames
dropped due to unrecognised MAC error flags.  The rx_dropped counter
is already incremented by the caller carl9170_rx_untie_data() at the
drop: label for all frames that carl9170_rx_mac_status() rejects with
a non-zero return code.  Remove the stale comment.

Wiring ar->rx_dropped into netdev stats (get_stats) is handled by a
separate bugfix patch.

Signed-off-by: Masi Osmani <mas-i@hotmail.de>
---
 drivers/net/wireless/ath/carl9170/rx.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/net/wireless/ath/carl9170/rx.c b/drivers/net/wireless/ath/carl9170/rx.c
index 6833430..c664014 100644
--- a/drivers/net/wireless/ath/carl9170/rx.c
+++ b/drivers/net/wireless/ath/carl9170/rx.c
@@ -340,8 +340,6 @@ static int carl9170_rx_mac_status(struct ar9170 *ar,

 	/* drop any other error frames */
 	if (unlikely(error)) {
-		/* TODO: update netdevice's RX dropped/errors statistics */
-
 		if (net_ratelimit())
 			wiphy_dbg(ar->hw->wiphy, "received frame with "
 			       "suspicious error code (%#x).\n", error);

-- 
Regards,
Masi

^ permalink raw reply related

* Re: [PATCH v2 07/10] carl9170: main: add exponential restart backoff
From: Masi Osmani @ 2026-03-31 19:20 UTC (permalink / raw)
  To: chunkeey; +Cc: linux-wireless, ath9k-devel
In-Reply-To: <a92b853c-fc20-4c6e-b587-ffa8f0f66ba4@gmail.com>

v2: Two fixes to the rx.c constants:

1. Remove CARL9170_FW_ERROR_WINDOW_MS: the constant was defined but never
   used in any expression.  Keeping unused preprocessor constants invites
   future confusion.

2. Fix CARL9170_FW_ERROR_THRESHOLD off-by-one: the original code used
   `> 3` (triggers on the 4th error).  The v1 patch changed this to
   `>= 3` which triggers one error earlier.  Set the threshold to 4 so
   that `>= CARL9170_FW_ERROR_THRESHOLD` is exactly equivalent to the
   original `> 3` behaviour.

The main.c backoff logic and carl9170.h struct fields are unchanged from v1.

---

When the AR9170 enters a bad state (firmware errors, command
timeouts, TX queue stalls), the driver can trigger rapid-fire
restarts that prevent the device from ever stabilizing.

Add exponential backoff to carl9170_restart(): if a restart
request arrives before the current backoff window has elapsed,
the request is throttled.  The backoff starts at 500 ms and
doubles on each restart, capping at 30 seconds.  A successful
restart resets the backoff to zero.

Additionally, use a named constant for the firmware error
threshold (CARL9170_FW_ERROR_THRESHOLD = 4) instead of a magic
number, preserving the original `> 3` trigger behaviour.

Signed-off-by: Masi Osmani <mas-i@hotmail.de>
---
 drivers/net/wireless/ath/carl9170/carl9170.h |  2 ++
 drivers/net/wireless/ath/carl9170/main.c     | 34 +++++++++++++++++++++++++++
 drivers/net/wireless/ath/carl9170/rx.c       |  3 ++-
 3 files changed, 38 insertions(+), 1 deletion(-)

diff --git a/drivers/net/wireless/ath/carl9170/carl9170.h b/drivers/net/wireless/ath/carl9170/carl9170.h
index a2ffa62..2eedb2f 100644
--- a/drivers/net/wireless/ath/carl9170/carl9170.h
+++ b/drivers/net/wireless/ath/carl9170/carl9170.h
@@ -301,6 +301,8 @@ struct ar9170 {
 	bool needs_full_reset;
 	bool force_usb_reset;
 	atomic_t pending_restarts;
+	unsigned long last_restart_jiffies;
+	unsigned int restart_backoff_ms;

 	/* interface mode settings */
 	struct list_head vif_list;
diff --git a/drivers/net/wireless/ath/carl9170/main.c b/drivers/net/wireless/ath/carl9170/main.c
index dcedcb1..ebf9fa9 100644
--- a/drivers/net/wireless/ath/carl9170/main.c
+++ b/drivers/net/wireless/ath/carl9170/main.c
@@ -467,7 +467,8 @@ static void carl9170_restart_work(struct work_struct *work)
 static void carl9170_restart_work(struct work_struct *work)
 {
 	struct ar9170 *ar = container_of(work, struct ar9170,
 					 restart_work);
 	int err = -EIO;
+	unsigned long flags;

 	ar->usedkeys = 0;
@@ -492,6 +493,11 @@ static void carl9170_restart_work(struct work_struct *work)
 	if (!err && !ar->force_usb_reset) {
 		ar->restart_counter++;
 		atomic_set(&ar->pending_restarts, 0);
+		spin_lock_irqsave(&ar->state_lock, flags);
+		ar->restart_backoff_ms = 0;
+		spin_unlock_irqrestore(&ar->state_lock, flags);

 		ieee80211_restart_hw(ar->hw);
 	} else {
@@ -505,6 +510,12 @@ static void carl9170_restart_work(struct work_struct *work)
 	}
 }

+#define CARL9170_RESTART_BACKOFF_INIT_MS	500
+#define CARL9170_RESTART_BACKOFF_MAX_MS		30000
+
 void carl9170_restart(struct ar9170 *ar, const enum carl9170_restart_reasons r)
 {
+	unsigned long flags;
+
 	carl9170_set_state_when(ar, CARL9170_STARTED, CARL9170_IDLE);
@@ -519,6 +530,34 @@ void carl9170_restart(struct ar9170 *ar, const enum carl9170_restart_reasons r)
 		return;
 	}

+	/*
+	 * Exponential backoff: if restarts are happening too frequently,
+	 * increase the delay before accepting the next one.  This prevents
+	 * restart storms when the device is in a bad state.
+	 *
+	 * last_restart_jiffies and restart_backoff_ms are read-modify-written
+	 * under state_lock to prevent races on SMP.
+	 */
+	spin_lock_irqsave(&ar->state_lock, flags);
+	if (ar->last_restart_jiffies &&
+	    time_before(jiffies, ar->last_restart_jiffies +
+			msecs_to_jiffies(ar->restart_backoff_ms))) {
+		spin_unlock_irqrestore(&ar->state_lock, flags);
+		dev_warn(&ar->udev->dev,
+			 "restart (%d) throttled (backoff %u ms)\n",
+			 r, ar->restart_backoff_ms);
+		atomic_dec(&ar->pending_restarts);
+		return;
+	}
+
+	ar->last_restart_jiffies = jiffies;
+	if (ar->restart_backoff_ms == 0)
+		ar->restart_backoff_ms = CARL9170_RESTART_BACKOFF_INIT_MS;
+	else
+		ar->restart_backoff_ms = min(ar->restart_backoff_ms * 2,
+					     (unsigned int)
+					     CARL9170_RESTART_BACKOFF_MAX_MS);
+	spin_unlock_irqrestore(&ar->state_lock, flags);
+
 	ieee80211_stop_queues(ar->hw);

 	dev_err(&ar->udev->dev, "restart device (%d)\n", r);
diff --git a/drivers/net/wireless/ath/carl9170/rx.c b/drivers/net/wireless/ath/carl9170/rx.c
index 414d499..bb909b5 100644
--- a/drivers/net/wireless/ath/carl9170/rx.c
+++ b/drivers/net/wireless/ath/carl9170/rx.c
@@ -46,6 +46,8 @@
 #include "hw.h"
 #include "cmd.h"

+#define CARL9170_FW_ERROR_THRESHOLD	4
+
 static void carl9170_dbg_message(struct ar9170 *ar, const char *buf, u32 len)
 {
 	bool restart = false;
@@ -54,7 +56,7 @@ static void carl9170_dbg_message(struct ar9170 *ar, const char *buf, u32 len)
 	if (len > 3) {
 		if (memcmp(buf, CARL9170_ERR_MAGIC, 3) == 0) {
 			ar->fw.err_counter++;
-			if (ar->fw.err_counter > 3) {
+			if (ar->fw.err_counter >= CARL9170_FW_ERROR_THRESHOLD) {
 				restart = true;
 				reason = CARL9170_RR_TOO_MANY_FIRMWARE_ERRORS;
 			}

-- 
Regards,
Masi

^ permalink raw reply related

* Re: [PATCH ath-next v3 1/6] dt-bindings: net: wireless: add ath12k wifi device IPQ5424
From: Raj Kumar Bhagat @ 2026-03-31 19:23 UTC (permalink / raw)
  To: Jeff Johnson, Krzysztof Kozlowski
  Cc: Johannes Berg, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Jeff Johnson, linux-wireless, devicetree, linux-kernel, ath12k
In-Reply-To: <cf1d8b38-a5d0-46c6-8016-b366637d6c72@oss.qualcomm.com>

On 31-03-2026 21:14, Jeff Johnson wrote:
> On 3/31/2026 7:42 AM, Krzysztof Kozlowski wrote:
>> On 31/03/2026 16:23, Jeff Johnson wrote:
>>> On 3/31/2026 12:24 AM, Krzysztof Kozlowski wrote:
>>>> On Tue, Mar 31, 2026 at 02:09:06AM +0530, Raj Kumar Bhagat wrote:
>>>>>   $id: http://devicetree.org/schemas/net/wireless/qcom,ipq5332-wifi.yaml#
>>>>> @@ -17,6 +17,7 @@ properties:
>>>>>     compatible:
>>>>>       enum:
>>>>>         - qcom,ipq5332-wifi
>>>>> +      - qcom,ipq5424-wifi
>>>>
>>>> No, use previous patch.
>>>>
>>>> I am annoyed that you keep making changes even for such trivialities and
>>>> require re-review from the community.  Previous patch was correct. This
>>>> one doing whatever you want to do in copyrights is too much. You don't
>>>> change copyrights just because you wrote one device model.
>>>
>>> Krzysztof,
>>>
>>> FYI here is the guidance I received from Qualcomm legal (links to internal
>>> documentation, removed -- I've forwarded the entire e-mail to your Qualcomm
>>> mailbox):
>>
>> As I explained already more than once, legal can engage in open source
>> discussions directly. I am not going to discuss with them via proxies.
>>
>>>
>>> ... Repos under copyleft license [...] QTI copyright must be added when we
>>> make significant changes.
>>>
>>> ... Repos under friendly license (BSD, Apache, MIT, ...) [...] QTI copyright
>>> must be added for any changes, not just significant ones.
>>>
>>> ... under the regular QUIC to QTI open-source copyright transitioning [...]
>>> all QUIC Copyright instances should be replaced with year-less QTI OSS Copyright.
>>>
>>> I'll follow up with them on this case where there is a dual-license file.
>>
>> You nicely removed the quote where they ask to follow what the upstream
>> maintainer asks for. So as one of the maintainers I ask not to change
>> it, because it is churn and pointless waste of my time.
> 
> Although I feel the latest patch correctly represents Qualcomm legal guidance,
> I'm not going to insist upon the copyright change.
> 
Thanks for the discussion,
will use previous DT binding patch (v2) in the next version.

^ permalink raw reply

* [PATCH] mac80211: stop hardware before clearing driver state on reconfig failure
From: Masi Osmani @ 2026-03-31 10:00 UTC (permalink / raw)
  To: Johannes Berg; +Cc: linux-wireless, Christian Lamparter

This patch fixes a hard system freeze (requires power cycle) observed
when unplugging an AR9170 USB WiFi adapter while under traffic, or when
any driver that uses ieee80211_reconfig() encounters a firmware deadlock.

The race: ieee80211_handle_reconfig_failure() tears down station tables
while USB RX tasklets can still deliver frames that reference them
(use-after-free). Calling drv_stop() before clearing IN_DRIVER state
closes the race at the root cause.

Tested-by: Masi Osmani <mas-i@hotmail.de>  [AR9170 USB, Linux 6.18]

---

When ieee80211_handle_reconfig_failure() is called after a failed HW
reconfiguration, it clears IEEE80211_SDATA_IN_DRIVER flags on all
interfaces but does not stop the hardware. This creates a race window:
cfg80211_shutdown_all_interfaces() subsequently calls ieee80211_do_stop()
which runs sta_info_flush() to destroy stations, while the driver's RX
path may still be delivering frames that reference station data being
freed.

This race was observed with the carl9170 driver: when firmware
deadlocks during a restart attempt, ieee80211_reconfig() fails
at drv_add_interface(). The subsequent interface teardown triggers
sta_info_destroy_part2() while the USB RX tasklet still calls
ieee80211_rx_napi(), causing a use-after-free kernel panic.

The fix stops the hardware in ieee80211_handle_reconfig_failure() before
clearing IN_DRIVER state, ensuring no driver can deliver RX frames once
the teardown begins. The drv_stop() call is guarded by local->started
since some call sites reach this function after drv_start() has already
failed (where the hardware was never started).

Signed-off-by: Masi Osmani <mas-i@hotmail.de>
---
 net/mac80211/util.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/net/mac80211/util.c b/net/mac80211/util.c
--- a/net/mac80211/util.c
+++ b/net/mac80211/util.c
@@ -1614,6 +1614,18 @@ static void ieee80211_handle_reconfig_failure(struct ieee80211_local *local)

 	local->resuming = false;
 	local->suspended = false;
+
+	/*
+	 * Stop the hardware before clearing IN_DRIVER state. Without this,
+	 * cfg80211_shutdown_all_interfaces() tears down stations via
+	 * sta_info_flush() while the driver's RX path may still deliver
+	 * frames referencing station data being freed, causing use-after-free.
+	 * Guard with local->started since this function can be reached when
+	 * drv_start() itself failed (hardware never started).
+	 */
+	if (local->started)
+		drv_stop(local, false);
+
 	local->in_reconfig = false;
 	local->reconfig_failure = true;

--
Regards,
Masi Osmani

^ permalink raw reply

* [PATCH] carl9170: main: verify firmware alive before init in op_start()
From: Masi Osmani @ 2026-03-31 10:01 UTC (permalink / raw)
  To: Christian Lamparter; +Cc: linux-wireless, ath9k-devel

After carl9170_usb_open() uploads the firmware and asserts the USB
endpoints are ready, the driver proceeds with carl9170_init_mac() and
subsequent register writes without verifying that the firmware is
actually executing and responding to commands.

If the firmware is in a deadlocked state, the register writes in
carl9170_init_mac(), carl9170_set_qos(), and carl9170_upload_key()
will time out one by one, each waiting up to 1 second, delaying the
inevitable error by many seconds before mac80211 sees a failure.

Add a carl9170_echo_test() immediately after carl9170_usb_open() to
verify the firmware is alive before committing to the full init
sequence.  If the echo test fails, propagate the error immediately.

The echo test runs while the device state is still CARL9170_IDLE,
which satisfies the IS_ACCEPTING_CMD() guard in carl9170_exec_cmd().
Only a genuine firmware deadlock (echo timing out) causes the early
return.

Signed-off-by: Masi Osmani <mas-i@hotmail.de>
---
--- a/drivers/net/wireless/ath/carl9170/main.c
+++ b/drivers/net/wireless/ath/carl9170/main.c
@@ -374,6 +374,16 @@ static int carl9170_op_start(struct ieee80211_hw *hw)
 	err = carl9170_usb_open(ar);
 	if (err)
 		goto out;
+
+	/*
+	 * Verify the firmware is alive before the full init sequence.
+	 * A deadlocked firmware causes each register write to time out
+	 * (1s each), stalling mac80211.  Fail fast.
+	 * State is CARL9170_IDLE here, satisfying IS_ACCEPTING_CMD().
+	 */
+	err = carl9170_echo_test(ar, 0xdeadbeef);
+	if (err)
+		goto out;

 	err = carl9170_init_mac(ar);
 	if (err)

--
Regards,
Masi Osmani

^ permalink raw reply

* [PATCH] carl9170: main: guard op_stop() against non-STARTED state
From: Masi Osmani @ 2026-03-31 10:02 UTC (permalink / raw)
  To: Christian Lamparter; +Cc: linux-wireless, ath9k-devel

When carl9170_op_start() fails partway through (e.g. after
carl9170_usb_open() but before carl9170_set_state_when() transitions
to CARL9170_STARTED), the device state remains CARL9170_IDLE.
mac80211 may still call carl9170_op_stop() in this case as part of
error cleanup.

IS_ACCEPTING_CMD() checks state >= CARL9170_IDLE, so the existing
guard in op_stop() does not prevent register writes when op_start()
never completed.  Sending USB commands to hardware that was never
fully initialized causes unnecessary command timeouts and misleading
error messages in dmesg.

Capture IS_STARTED() into a local variable before
carl9170_set_state_when() transitions CARL9170_STARTED -> IDLE.
Gate the register write block on both was_started and IS_ACCEPTING_CMD()
so that teardown register writes only occur when op_start() fully
succeeded.  carl9170_zap_queues() and carl9170_cancel_worker() are
safe to call regardless of state and remain ungated.

Signed-off-by: Masi Osmani <mas-i@hotmail.de>
---
--- a/drivers/net/wireless/ath/carl9170/main.c
+++ b/drivers/net/wireless/ath/carl9170/main.c
@@ -442,11 +442,12 @@ static void carl9170_op_stop(struct ieee80211_hw *hw, bool suspend)
 {
 	struct ar9170 *ar = hw->priv;
+	bool was_started = IS_STARTED(ar);

 	carl9170_set_state_when(ar, CARL9170_STARTED, CARL9170_IDLE);

 	ieee80211_stop_queues(ar->hw);

 	mutex_lock(&ar->mutex);
-	if (IS_ACCEPTING_CMD(ar)) {
+	if (was_started && IS_ACCEPTING_CMD(ar)) {
 		RCU_INIT_POINTER(ar->beacon_iter, NULL);

 		carl9170_led_set_state(ar, 0);

--
Regards,
Masi Osmani

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox