public inbox for ath12k@lists.infradead.org
From: Baochen Qiang <baochen.qiang@oss.qualcomm.com>
To: Saikiran B <bjsaikiran@gmail.com>
Cc: ath12k@lists.infradead.org, linux-wireless@vger.kernel.org,
	kvalo@kernel.org
Subject: Re: [PATCH v2 2/2] wifi: ath12k: Fix firmware stats leak when pdev list is empty
Date: Fri, 30 Jan 2026 10:09:03 +0800
Message-ID: <fbcbeb0f-c073-4da5-9dbe-2518a1d31f20@oss.qualcomm.com>
In-Reply-To: <CAAFDt1s_NtY1vXa5STRW7oQn9yDJBC0g7CPSZXn0tFhd+CSHrQ@mail.gmail.com>



On 1/29/2026 10:06 PM, Saikiran B wrote:
> On Thu, Jan 29, 2026 at 7:57 AM Baochen Qiang
> <baochen.qiang@oss.qualcomm.com> wrote:
>>
>>
>>
>> On 1/27/2026 12:40 PM, Saikiran B wrote:
>>> I have analyzed the logs and code flow in depth to provide more
>>> definitive answers for your questions.
>>>
>>> The log entries showing the failure are:
>>> [  563.574076] ath12k_pci 0004:01:00.0: failed to pull fw stats: -71
>>> [  564.575896] ath12k_pci 0004:01:00.0: time out while waiting for get fw stats
>>>
>>> 1. Why are other stats populated?
>>> The "failed to pull fw stats: -71" error is not the initial failure
>>> but a symptom that appears after repeated operations. The leak happens
>>> during *successful* calls prior to this error.
>>>
>>> Code flow proving the leak:
>>> - ath12k_mac_get_fw_stats() sends WMI_REQUEST_PDEV_STAT.
>>> - Firmware responds. ath12k_update_stats_event() parses the response.
>>> - ath12k_wmi_fw_stats_process() is called, which splices 'vdevs' and
>>> 'beacon' stats into ar->fw_stats.vdevs/bcn.
>>> - ath12k_mac_get_fw_stats() returns 0 (Success).
>>> - In ath12k_mac_op_get_txpower(), the check `if (!pdev)` is true when the
>>> pdev-specific list is empty (even though the vdev list is NOT empty).
>>> - The function exits via `err_fallback` WITHOUT calling ath12k_fw_stats_reset().
>>> - Result: The 'vdev' and 'beacon' stats that were spliced into
>>> ar->fw_stats remain there, leaking memory and accumulating with every
>>> call.
>>>
>>> 2. Exact place where -71 is printed:
>>> The error "failed to pull fw stats: -71" is printed in
>>> ath12k_update_stats_event() (drivers/net/wireless/ath/ath12k/wmi.c).
>>> It corresponds to "ret = ath12k_wmi_pull_fw_stats()" returning -EPROTO.
>>> This propagates from ath12k_wmi_tlv_fw_stats_data_parse() (same file),
>>> when buffer validation checks (like `len < sizeof(*src)`) fail.
>>>
>>> Conclusion:
>>> The fix in my patch (resetting stats when `!pdev`) is critical because
>>> it ensures that the accumulated 'vdev' and 'beacon' stats are freed
>>> even when the 'pdev' list ends up empty.
>>>
>>> Let me know if you need anything else.
>>
>> Can you please try the series below to see if it fixes your issue?
>>
>> https://lore.kernel.org/r/20260129-ath12k-fw-stats-fixes-v1-0-55d66064f4d5@oss.qualcomm.com
>>
>>>
>>> Thanks & Regards,
>>> Saikiran
>>>
>>> On Tue, Jan 27, 2026 at 9:47 AM Saikiran B <bjsaikiran@gmail.com> wrote:
>>>>
>>>> Hi Baochen,
>>>>
>>>> Regarding your questions:
>>>>
>>>> "Are other stats populated?"
>>>>
>>>> Yes. When ath12k_mac_get_fw_stats() returns success (0), it means the
>>>> firmware response was received and valid WMI events were processed.
>>>> The firmware response to WMI_REQUEST_PDEV_STAT typically includes
>>>> multiple stats TLVs (vdev stats, beacon stats, etc.). Even if the
>>>> "pdev stats" list ends up empty (e.g., due to specific filtering or
>>>> availability), the firmware should have populated other lists (like
>>>> vdevs or beacons) in the ar->fw_stats structure. If we don't reset,
>>>> these valid entries leak and accumulate.
>>>>
>>>> "Where exactly is -71 (EPROTO) printed?"
>>>>
>>>> The log "failed to pull fw stats: -71" is printed in
>>>> ath12k_update_stats_event() (wmi.c line 8500 in my tree). This error
>>>> code (-EPROTO) propagates from ath12k_wmi_tlv_fw_stats_data_parse(),
>>>> where it is returned when buffer validation checks fail (e.g., if (len
>>>> < sizeof(*src))). This failure suggests that the accumulated state or
>>>> memory corruption from the leak eventually causes the parser to fail
>>>> on subsequent events.
>>>>
>>>> So, fixing the leak is necessary for correctness regardless of the
>>>> specific side-effect error code.
>>>>
>>>> Thanks & Regards,
>>>> Saikiran
>>>>
>>>> On Tue, Jan 27, 2026 at 8:57 AM Baochen Qiang
>>>> <baochen.qiang@oss.qualcomm.com> wrote:
>>>>>
>>>>>
>>>>>
>>>>> On 1/26/2026 5:52 PM, Saikiran wrote:
>>>>>> The commits bd6ec8111e65 and 2977567b244f changed firmware stats handling
>>>>>> to be caller-driven, requiring explicit ath12k_fw_stats_reset() calls
>>>>>> after using ath12k_mac_get_fw_stats().
>>>>>>
>>>>>> In ath12k_mac_op_get_txpower(), when ath12k_mac_get_fw_stats() succeeds
>>>>>> but the pdev stats list is empty, the function exits without calling
>>>>>> ath12k_fw_stats_reset(). Even though the pdev list is empty, the firmware
>>>>>> may have populated other stats lists (vdevs, beacons, etc.) in the
>>>>>
>>>>> 'may' is not enough; we need to be 100% sure whether other stats are populated. This is
>>>>> critical for us to find the root cause.
>>>>>
>>>>>> ar->fw_stats structure.
>>>>>>
>>>>>> Without resetting the stats buffer, this data accumulates across multiple
>>>>>> calls, eventually causing the stats buffer to overflow and leading to
>>>>>> firmware communication failures (error -71/EPROTO) during subsequent
>>>>>> operations.
>>>>>>
>>>>>> The issue manifests during 5GHz scanning which triggers multiple TX power
>>>>>> queries. Symptoms include:
>>>>>> - "failed to pull fw stats: -71" errors in dmesg
>>>>>
>>>>> Still, can you please check the logs to see exactly where this is printed?
>>>>>
>>>>>> - 5GHz networks not detected despite hardware support
>>>>>> - 2.4GHz networks work normally
>>>>>>
>>>>>> Fix by calling ath12k_fw_stats_reset() when the pdev list is empty,
>>>>>> ensuring the stats buffer is properly cleaned up even when only partial
>>>>>> stats data is received from firmware.
>>>>>>
>>>>>> Fixes: bd6ec8111e65 ("wifi: ath12k: Make firmware stats reset caller-driven")
>>>>>> Link: https://bugs.launchpad.net/ubuntu-concept/+bug/2138308
>>>>>> Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.1.c5-00302 (Lenovo Yoga Slim 7x)
>>>>>> Signed-off-by: Saikiran <bjsaikiran@gmail.com>
>>>>>> ---
>>>>>>  drivers/net/wireless/ath/ath12k/mac.c | 1 +
>>>>>>  1 file changed, 1 insertion(+)
>>>>>>
>>>>>> diff --git a/drivers/net/wireless/ath/ath12k/mac.c b/drivers/net/wireless/ath/ath12k/mac.c
>>>>>> index e0e49f782bf8..6e35c3ee9864 100644
>>>>>> --- a/drivers/net/wireless/ath/ath12k/mac.c
>>>>>> +++ b/drivers/net/wireless/ath/ath12k/mac.c
>>>>>> @@ -5169,6 +5169,7 @@ static int ath12k_mac_op_get_txpower(struct ieee80211_hw *hw,
>>>>>>                                       struct ath12k_fw_stats_pdev, list);
>>>>>>       if (!pdev) {
>>>>>>               spin_unlock_bh(&ar->data_lock);
>>>>>> +             ath12k_fw_stats_reset(ar);
>>>>>>               goto err_fallback;
>>>>>>       }
>>>>>>
>>>>>
>>
> 
> Hi Baochen,
> 
> I tried applying your patches on top of v6.19-rc7 (which is the latest
> mainline release candidate I'm testing on), but I ran into build
> issues because some of the dependencies seem to be missing.
> 
> Specifically:
> Patch 2 ("wifi: ath12k: fix station lookup failure when disconnecting
> from AP") uses `ath12k_link_sta_find_by_addr()`, which does not exist
> in my tree. It seems your patches are based on a different tree
> (ath-next?) that has newer changes not yet in the mainline.
> 
> Could you please point me to the specific git repo/branch you are
> using? I can try to build and test on that baseline to be sure.

My bad. Please try the latest ath tree:

https://git.kernel.org/pub/scm/linux/kernel/git/ath/ath.git/

The base commit is the ath-202601271544 tag.

> 
> Regarding the firmware stats issue:
> I verified the firmware files match the latest available (MD5 sums
> matched), yet the "-71" errors and memory leak persist on my device
> without fixes.
> 
> I successfully applied the logic from your Patch 1 manually (since
> ath12k_mac_get_target_pdev_id() exists), but I haven't fully
> validated whether it alone resolves the leak in all scenarios.
> 
> However, the fix I proposed in my v2 patch (resetting stats when pdev
> list is empty) definitely stops the leak mechanism by ensuring cleanup
> happens even when the firmware returns partial stats (which seems to
> be the trigger condition).
> 
> I'll wait for your pointer to the base tree to do a proper test of your series.
> 
> Thanks & Regards,
> Saikiran
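
For anyone following the thread, the early-exit path under discussion can be
modelled in userspace. This is only a sketch under stated assumptions: the
struct and function names below (fw_stats_model, model_get_txpower, and so on)
are hypothetical stand-ins, not the ath12k code, and it simply assumes that
every response adds one vdev entry. Whether firmware really populates the vdev
list in the empty-pdev case is the open question above.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the driver's stats lists; not ath12k types. */
struct stats_node {
	struct stats_node *next;
};

struct fw_stats_model {
	struct stats_node *pdevs;
	struct stats_node *vdevs;
};

/* Prepend one freshly allocated entry; returns 0 on success, -1 on OOM. */
static int list_push(struct stats_node **head)
{
	struct stats_node *n = malloc(sizeof(*n));

	if (!n)
		return -1;
	n->next = *head;
	*head = n;
	return 0;
}

static size_t list_len(const struct stats_node *n)
{
	size_t len = 0;

	for (; n; n = n->next)
		len++;
	return len;
}

static void list_free(struct stats_node **head)
{
	while (*head) {
		struct stats_node *n = *head;

		*head = n->next;
		free(n);
	}
}

/* Models the caller-driven reset: frees every list. */
static void model_reset(struct fw_stats_model *s)
{
	list_free(&s->pdevs);
	list_free(&s->vdevs);
}

/* Assumption: each response adds one vdev entry, and a pdev entry
 * only when has_pdev is non-zero.
 */
static int model_get_fw_stats(struct fw_stats_model *s, int has_pdev)
{
	if (list_push(&s->vdevs))
		return -1;
	if (has_pdev && list_push(&s->pdevs))
		return -1;
	return 0;
}

/* Returns 0 on the normal path, -1 on the fallback path. */
static int model_get_txpower(struct fw_stats_model *s, int has_pdev,
			     int reset_on_empty)
{
	if (model_get_fw_stats(s, has_pdev))
		return -1;

	if (!s->pdevs) {
		/* Early exit: without a reset here the vdev entries stay queued. */
		if (reset_on_empty)
			model_reset(s);
		return -1;
	}

	model_reset(s);
	return 0;
}
```

In this model, calling model_get_txpower(&s, 0, 0) five times leaves five
entries on s.vdevs, while the same calls with reset_on_empty set to 1 leave
the list empty.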


